In 1996 and 1998, Hillis published two studies in which he used the 18S rDNA-based angiosperm model tree from Soltis et al. to simulate DNA sequences of various lengths. He calculated the probability of finding the correct model tree as a function of the method of analysis, the number of characters, and the substitution rate. He found that large trees are more easily recovered than previously thought, as only 5000 variable characters were required to give a 100% chance of finding the correct tree, and that higher substitution rates facilitate tree searches. We evaluated the same parameters, in addition to the performance of codon positions, for recovering several large trees based on atpB, rbcL, and 18S rDNA. Contrary to Hillis's studies, we found that larger trees were not easier to recover than smaller ones, and that with rbcL and atpB the proportion of the tree correctly recovered plateaued at about 80%. With 18S rDNA, however, we obtained results similar to Hillis's across various tree sizes, which makes 18S rDNA a special case. To evaluate the feasibility of building a complete generic-level angiosperm phylogenetic tree from molecular data, we have performed similar simulations with trees comprising several thousand taxa, and we will discuss how the probability of finding these very large model trees depends on the sequence and tree parameters used in the simulations.

Key words: angiosperms, large trees, simulations