LITTLE, DAMON P.* and KEVIN C. NIXON. L.H. Bailey Hortorium, 462 Mann Library, Cornell University, Ithaca, NY 14853. - Speed, efficiency, and more data in cladistic analysis.
The terms phylogenetic efficiency and phylogenetic accuracy have often
been confused in the literature. Accuracy is a measure of truth, while
efficiency is the ratio of the computer time devoted to conducting an
analysis to the frequency with which optimal trees are obtained. It has been
argued, using an empirical example, that additional character data
will actually decrease the overall analysis time and therefore
increase efficiency. Although this example claimed a decrease in the
time required for a tree search, the success or failure of each tree
search was not accounted for. Thus, this example did not in fact
demonstrate an improvement in efficiency. For our study, the three-gene
angiosperm matrix (567 taxa; rbcL, atpB, and 18S sequences) was used as
a model data set. We found: (1) Not all methods of analysis have the
same success rate. The types of traditional searches that can be
completed in a "reasonable" amount of time have a near-zero
success rate and are therefore totally inefficient, while
non-traditional methods (e.g., the parsimony ratchet) are much more
efficient. (2) No single method is most efficient for all
data sets. In some cases tree fusion is much more efficient than the
ratchet (18S), while in other cases the reverse is true (rbcL, atpB).
(3) The pattern of homoplasy within a data set has a greater impact on
the efficiency of analysis than the number of characters. (4) Because
additional characters slow the analysis (the increase in time is
linear), they can make the analysis more efficient only if they
increase the decisiveness of the data.
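
The definition of efficiency used above can be made concrete with a minimal sketch. It assumes efficiency is reported as computer time per search divided by the frequency with which searches find optimal trees, so lower values are better. The function name `cost_per_success` and all numbers are hypothetical illustrations, not results from this study.

```python
import math


def cost_per_success(total_cpu_seconds: float, searches: int, successes: int) -> float:
    """Time per search divided by success frequency (lower is more efficient).

    Hypothetical helper: 'successes' counts searches that found an optimal tree.
    """
    if searches <= 0:
        raise ValueError("searches must be positive")
    if not 0 <= successes <= searches:
        raise ValueError("successes must lie between 0 and searches")
    success_frequency = successes / searches
    if success_frequency == 0:
        # A search strategy that never finds an optimal tree is
        # infinitely costly per success, however fast each search runs.
        return math.inf
    time_per_search = total_cpu_seconds / searches
    return time_per_search / success_frequency


# Illustrative (made-up) numbers: a fast strategy that never succeeds
# versus a slower strategy that succeeds in a quarter of its searches.
fast_but_failing = cost_per_success(1000.0, searches=100, successes=0)
slow_but_reliable = cost_per_success(5000.0, searches=100, successes=25)
print(fast_but_failing, slow_but_reliable)
```

Under these assumptions, a shorter search time alone does not imply better efficiency: the faster strategy is the less efficient one because its success frequency is zero.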
Key words: analysis, cladistic, efficiency, phylogenetic, speed, tree search