
The empirical discovery of phylogenetic invariants

Published online by Cambridge University Press:  01 July 2016

V. Ferretti*
Affiliation:
Université de Montréal
D. Sankoff*
Affiliation:
Université de Montréal
* Postal address for both authors: Centre de recherches mathématiques, Université de Montréal, C.P.6128 Succursale “A”, Montréal, Canada H3C 3J7.

Abstract

An invariant Φ of a tree T under a k-state Markov model, where the time parameter is identified with the edges of T, allows us to recognize whether data on N observed species can be associated with the N terminal vertices of T in the sense of having been generated on T rather than on any other tree with N terminals. The invariance is with respect to the (time) lengths associated with the edges of the tree. We propose a general method of finding invariants of a parametrized functional form. It involves calculating the probability f of all k^N data possibilities for each of m edge-length configurations of T, then solving for the parameters using the m equations of form Φ(f) = 0. We apply this to the case of quadratic invariants for unrooted binary trees with four terminals, for all k, using the Jukes–Cantor type of Markov matrix. We report partial results on finding the smallest algebraically independent set of invariants.
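
The procedure described in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation; it only assumes what the abstract states (a Jukes–Cantor type matrix, a four-terminal unrooted binary tree, a quadratic form Φ) and fills in the rest with labelled assumptions: the edge parametrization a = (1 − e^(−t))/k, a uniform distribution at an arbitrarily chosen internal vertex, random edge lengths drawn from [0.1, 2.0], and a singular-value tolerance for deciding which coefficient vectors count as solutions. All function names are hypothetical.

```python
import numpy as np

def jc_matrix(k, t):
    """Jukes-Cantor type k-state matrix: every off-diagonal entry equals a.
    Assumption: a = (1 - exp(-t)) / k for an edge of length t."""
    a = (1.0 - np.exp(-t)) / k
    return np.full((k, k), a) + (1.0 - k * a) * np.eye(k)

def pattern_probs(k, t):
    """Probabilities f of all k**4 patterns on the tree ((1,2),(3,4)).
    t holds five edge lengths: four terminal edges, then the internal edge.
    Assumption: uniform state distribution at one internal vertex."""
    P1, P2, P3, P4, P5 = (jc_matrix(k, ti) for ti in t)
    root = np.full(k, 1.0 / k)
    f = np.einsum('a,ai,aj,ab,bk,bl->ijkl', root, P1, P2, P5, P3, P4)
    return f.ravel()

def quad_monomials(f):
    """All products f_i * f_j with i <= j, in a fixed order."""
    i, j = np.triu_indices(f.size)
    return f[i] * f[j]

def find_quadratic_invariants(k=2, m=200, tol=1e-10, seed=0):
    """Coefficient vectors c with sum_{i<=j} c_ij f_i f_j = 0 (numerically)
    for each of m random edge-length configurations.
    m should exceed the number of monomials (136 when k = 2)."""
    rng = np.random.default_rng(seed)
    M = np.array([quad_monomials(pattern_probs(k, rng.uniform(0.1, 2.0, 5)))
                  for _ in range(m)])
    # Each row of M is one equation Phi(f) = 0, linear in the coefficients;
    # the (numerical) null space of M is read off from the SVD.
    _, s, vt = np.linalg.svd(M, full_matrices=False)
    return vt[s < tol * s[0]]

if __name__ == "__main__":
    inv = find_quadratic_invariants(k=2)
    f_new = pattern_probs(2, np.array([0.3, 1.1, 0.7, 0.5, 0.9]))
    print("candidate quadratic forms:", inv.shape[0])
    print("largest |Phi(f)| on a fresh configuration:",
          np.max(np.abs(inv @ quad_monomials(f_new))))
```

The candidate set returned this way is a basis of a linear space, so it also contains forms that vanish for trivial reasons (for example, products involving differences of patterns that are equivalent under relabelling of states). Separating those from forms that discriminate between trees, and reducing to an algebraically independent set, is the harder problem the abstract refers to and is not attempted here.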

Type
Research Article
Copyright
Copyright © Applied Probability Trust 1993 


References

Buneman, P. (1974) A note on the metric property of trees. J. Combinatorial Theory B 17, 48–50.
Cavender, J. A. (1989) Mechanized derivation of linear invariants. Mol. Biol. Evol. 6, 301–316.
Cavender, J. A. and Felsenstein, J. (1987) Invariants of phylogenies: Simple case with discrete states. J. Classification 4, 57–71.
Dobson, A. J. (1974) Unrooted trees for numerical taxonomy. J. Appl. Prob. 11, 32–42.
Drolet, S. and Sankoff, D. (1990) Quadratic tree invariants for multivalued characters. J. Theoret. Biol. 144, 117–129.
Evans, S. N. and Speed, T. P. (in press) Invariants of some probability models used in phylogenetic inference. Ann. of Statist. To appear.
Felsenstein, J. (1983) Inferring evolutionary trees from DNA sequences. In Statistical Analysis of DNA Sequences, ed. Weir, B. S., Marcel Dekker, New York, pp. 133–150.
Felsenstein, J. (1991) Counting phylogenetic invariants in some simple cases. J. Theoret. Biol. 152, 357–376.
Fu, Y. X. and Li, W. H. (1992a) Necessary and sufficient conditions for the existence of certain quadratic invariants under a phylogenetic tree. Math. Biosci. 108, 203–218.
Fu, Y. X. and Li, W. H. (1992b) Construction of linear invariants in phylogenetic inference. Math. Biosci. 109, 201–228.
Jukes, T. H. and Cantor, C. R. (1969) Evolution of protein molecules. In Mammalian Protein Metabolism, ed. Munro, H. N., pp. 21–132. Academic Press, New York.
Kimura, M. (1980) A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120.
Lake, J. A. (1987) A rate-independent technique for analysis of nucleic acid sequences: Evolutionary parsimony. Mol. Biol. Evol. 4, 167–191.
Lake, J. A. (1988) Origin of the eukaryotic nucleus determined by rate-invariant analysis of rRNA sequences. Nature 331, 184–186.
Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979) Multivariate Analysis. Academic Press, London.
Sankoff, D. (1990) Designer invariants for large phylogenies. Mol. Biol. Evol. 7, 255–269.
Steel, M. A., Hendy, M. D., Székely, L. A. and Erdős, P. L. (1992) Spectral analysis and a closest tree method for genetic sequences. Appl. Math. Lett. 5, 63–67.
Székely, L. A., Steel, M. A. and Erdős, P. L. Fourier calculus on evolutionary trees. Adv. Appl. Math. To appear.