eCite Digital Repository

Treeness Triangles: Visualising the Loss of Phylogenetic Signal

Citation

White, WT and Hills, SF and Gaddam, R and Holland, BR and Penny, D, Treeness Triangles: Visualising the Loss of Phylogenetic Signal, Molecular Biology and Evolution, 24, (9) pp. 1529-1541. ISSN 0737-4038 (2007) [Refereed Article]



Copyright Statement

The definitive publisher-authenticated version is available online at: www.oxfordjournals.org

DOI: 10.1093/molbev/msm139

Abstract

It is well known that molecular data "saturates" with increasing sequence divergence (thereby losing phylogenetic information) and that saturation may be accompanied by the accumulation of misleading information due to chance similarities or to systematic bias. Exploratory data analysis methods that can quantify the extent of signal loss or convergence for a given data set are scarce. Such methods are needed because genomics delivers very long sequence alignments spanning substantial phylogenetic depth, where site saturation may be compounded by systematic biases or other alternative signals. Here we introduce the Treeness Triangle (TT) graph, in which signals detectable by Hadamard (spectral) analysis are summed into 3 categories: those supporting 1) external and 2) internal branches in the optimal tree, and 3) the residuals (potential internal branches not present in the optimal tree). These 3 values are plotted in a standard ternary coordinate system. The approach is illustrated with simulated and real data sets, the latter from complete chloroplast genomes, where potential problems of paralogy or lateral gene acquisition can be excluded. The TT uncovers the divergence-dependent loss of phylogenetic signal as subsets of chloroplast genomes spanning increasingly deep evolutionary timescales are investigated. The rate of signal loss (or signal retention) varies with the gene and/or the method of analysis.
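The final plotting step described in the abstract can be illustrated with a minimal sketch. It assumes the three category sums (external, internal, residual) have already been obtained from a spectral analysis and are non-negative. The function name, the choice of which corner of the triangle represents which category, and the example numbers are all hypothetical and are not taken from the paper.

```python
import math


def treeness_triangle_point(external, internal, residual):
    """Normalise three signal sums and map them to 2-D ternary coordinates.

    Assumed (hypothetical) corner layout of the equilateral triangle:
      external -> (0, 0), internal -> (1, 0), residual -> (0.5, sqrt(3)/2)
    """
    if min(external, internal, residual) < 0:
        raise ValueError("signal sums are assumed to be non-negative")
    total = external + internal + residual
    if total <= 0:
        raise ValueError("at least one signal sum must be positive")

    # Fractions of the total signal; these sum to 1.
    e = external / total
    i = internal / total
    r = residual / total

    # Barycentric-to-Cartesian conversion for the corner layout above.
    x = i + 0.5 * r
    y = (math.sqrt(3) / 2) * r
    return (e, i, r), (x, y)


# Illustrative (made-up) sums: half the signal on external branches,
# a quarter on internal branches, a quarter in the residuals.
fractions, xy = treeness_triangle_point(2.0, 1.0, 1.0)
```

Under this layout, a data set whose signal lies entirely in the residual category lands on the top corner, while one with no residual signal lies along the base of the triangle.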

Item Details

Item Type: Refereed Article
Keywords: plastid genomes • spectral analysis • model misspecification • exploratory data analysis • ternary plot • Hadamard conjugation
Research Division: Information and Computing Sciences
Research Group: Computer Software
Research Field: Bioinformatics Software
Objective Division: Expanding Knowledge
Objective Group: Expanding Knowledge
Objective Field: Expanding Knowledge in the Biological Sciences
Author: Holland, BR (Associate Professor Barbara Holland)
ID Code: 62970
Year Published: 2007
Web of Science® Times Cited: 17
Deposited By: Mathematics
Deposited On: 2010-03-31
Last Modified: 2012-03-07
Downloads: 0
