eCite Digital Repository

Sampling trees from evolutionary models


Hartmann, K and Wong, D and Stadler, T, Sampling trees from evolutionary models, Systematic Biology, 59, (4) pp. 465-476. ISSN 1063-5157 (2010) [Refereed Article]


Copyright Statement

The definitive publisher-authenticated version is available online at:


DOI: 10.1093/sysbio/syq026


Abstract: A wide range of evolutionary models for species-level (and higher) diversification have been developed. These models can be used to test evolutionary hypotheses and provide comparisons with phylogenetic trees constructed from real data. To carry out these tests and comparisons, it is often necessary to sample, or simulate, trees from the evolutionary models. Sampling trees from these models is more complicated than it may appear at first glance, necessitating careful consideration and mathematical rigor. Seemingly straightforward sampling methods may produce trees that have systematically biased shapes or branch lengths. This is particularly problematic as there is no simple method for determining whether the sampled trees are appropriate. In this paper, we show why a commonly used simple sampling approach (SSA), simulating trees forward in time until n species are first reached, should only be applied to the simplest pure birth model, the Yule model. We provide an alternative general sampling approach (GSA) that can be applied to most other models. Furthermore, we introduce the constant-rate birth–death model sampling approach, which samples trees very efficiently from a widely used class of models. We explore the bias produced by SSA and identify situations in which this bias is particularly pronounced. We show that using SSA can lead to erroneous conclusions: When using the inappropriate SSA, the variance of a gradually evolving trait does not correlate with the age of the tree; when the correct GSA is used, the trait variance correlates with tree age. The algorithms presented here are available in the Perl Bio::Phylo package, as a stand-alone program TreeSample, and in the R TreeSim package. [Algorithms; distribution; evolutionary models; phylogenetic trees; sampling; simulating.]
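
The stopping rule that the abstract calls SSA (simulate forward in time until n species are first reached) can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the Bio::Phylo, TreeSample, or TreeSim interfaces. It assumes a constant-rate birth-death process, tracks only the number of lineages rather than a full tree, and restarts whenever the process goes extinct; the function name `ssa_first_hit` and all of its parameters are hypothetical.

```python
import random


def ssa_first_hit(n, birth_rate, death_rate, rng, max_restarts=10_000):
    """Return the time at which n lineages are FIRST reached.

    Hypothetical illustration of the SSA stopping rule under an assumed
    constant-rate birth-death process. Only lineage counts are tracked,
    not tree topology or branch lengths.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if birth_rate <= 0 or death_rate < 0:
        raise ValueError("need birth_rate > 0 and death_rate >= 0")

    per_lineage_rate = birth_rate + death_rate
    for _ in range(max_restarts):
        lineages, t = 1, 0.0
        while 0 < lineages < n:
            # Waiting time to the next event across all current lineages.
            t += rng.expovariate(lineages * per_lineage_rate)
            if rng.random() < birth_rate / per_lineage_rate:
                lineages += 1  # speciation
            else:
                lineages -= 1  # extinction
        if lineages == n:
            return t  # stop at the first visit to n lineages
        # Otherwise the process died out: discard it and start over.
    raise RuntimeError("no simulation reached n lineages")


if __name__ == "__main__":
    rng = random.Random(1)
    print(ssa_first_hit(10, birth_rate=1.0, death_rate=0.0, rng=rng))
```

With `death_rate=0.0` this is the pure birth (Yule) setting, the only one for which the abstract says SSA should be applied. With `death_rate > 0` the lineage count can fall and later return to n, and this sketch only ever records the first such visit.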

Item Details

Item Type: Refereed Article
Keywords: Algorithms; distribution; evolutionary models; phylogenetic trees; sampling; simulating
Research Division: Mathematical Sciences
Research Group: Applied mathematics
Research Field: Biological mathematics
Objective Division: Expanding Knowledge
Objective Group: Expanding knowledge
Objective Field: Expanding knowledge in the biological sciences
UTAS Author: Hartmann, K (Associate Professor Klaas Hartmann)
ID Code: 65914
Year Published: 2010
Web of Science® Times Cited: 53
Deposited By: TAFI - Marine Research Laboratory
Deposited On: 2010-12-09
Last Modified: 2015-01-27
