We describe a general likelihood-based ‘mixture model’ approach for inferring phylogenetic trees from gene-sequence or other character-state data. Conventional models of gene-sequence evolution assume either that all sites evolve according to a single homogeneous model of evolution or that rates of evolution vary among sites according to some statistical distribution, such as the gamma. In a phylogenetic context, a mixture model allows more than one model of the evolutionary process to be fitted to each site in the alignment, with the likelihood being summed over models at each site. We describe how mixture modelling can accommodate cases in which different sites in the alignment evolve in qualitatively distinct ways, a phenomenon we call ‘pattern heterogeneity’. Mixture models can also accommodate sites whose rate of evolution varies in different parts of the tree. This is often called ‘heterotachy’, and we show how a mixture model approach can improve on a parametric covarion model for heterotachous data, and how the model can be used to identify which sites are evolving heterotachously and where in the tree this behaviour occurs. The mixture model does not require prior knowledge of the patterns or of the nature of the heterotachy in the data, nor does it partition the data. We present studies showing that the model correctly retrieves known patterns from simulated gene-sequence data, and present evidence that it generally improves the likelihood of the data substantially over simpler models of gene-sequence evolution. Mixture modelling seems a promising technique for discovering some of the complex evolutionary processes that underlie sequence evolution, and may be of interest to researchers who wish to identify how specific sites respond adaptively to particular environments. We implement the model within a Bayesian Markov chain Monte Carlo framework, and make it available in our BayesPhylogenies software (www.evolution.rdg.ac.uk).
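The phrase "the likelihood being summed over models at each site" can be made concrete with a minimal sketch. It is not the BayesPhylogenies implementation. It assumes that the likelihood of each site under each candidate model has already been computed on a fixed tree, and that each model j carries a mixture weight w_j, with the weights summing to one. The function names, the weights, and the numbers are hypothetical and purely illustrative. The second function applies Bayes' rule to the same quantities to suggest one way in which sites might be matched to the model that best describes them.

```python
import math

def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a mixture of models.

    site_likelihoods[i][j] is assumed to hold P(site i | tree, model j),
    computed elsewhere (for example by a pruning algorithm).
    weights[j] is the assumed mixture weight of model j.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    total = 0.0
    for per_model in site_likelihoods:
        # Sum over models at this site, each weighted by its mixture weight.
        site_likelihood = sum(w * l for w, l in zip(weights, per_model))
        # Sites are treated as independent, so log-likelihoods add.
        total += math.log(site_likelihood)
    return total

def site_model_probabilities(site_likelihoods, weights):
    """For each site, the probability of each model given that site's data
    (Bayes' rule with the mixture weights acting as prior probabilities)."""
    result = []
    for per_model in site_likelihoods:
        joint = [w * l for w, l in zip(weights, per_model)]
        norm = sum(joint)
        result.append([x / norm for x in joint])
    return result

# Made-up per-site likelihoods for three sites under two models.
site_likelihoods = [
    [0.020, 0.001],  # site 1 is better explained by model 1
    [0.002, 0.030],  # site 2 is better explained by model 2
    [0.010, 0.010],  # site 3 does not discriminate between the models
]
weights = [0.6, 0.4]  # hypothetical mixture weights

log_lik = mixture_log_likelihood(site_likelihoods, weights)
probs = site_model_probabilities(site_likelihoods, weights)
print(log_lik)
for row in probs:
    print(row)
```

Because every site is evaluated under every model, no site has to be assigned to a model in advance, which is consistent with the statement that the approach does not partition the data. In a full analysis the weights and the parameters of each model would themselves be estimated rather than fixed as they are here.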