Journal article
Phylo2Vec: a vector representation for binary trees
- Abstract:
Binary phylogenetic trees inferred from biological data are central to understanding the shared history among evolutionary units. However, inferring the placement of latent nodes in a tree is computationally expensive. State-of-the-art methods rely on carefully designed heuristics for tree search, using different data structures for easy manipulation (e.g., classes in object-oriented programming languages) and readable representation of trees (e.g., Newick-format strings). Here, we present Phylo2Vec, a parsimonious encoding for phylogenetic trees that serves as a unified approach for both manipulating and representing phylogenetic trees. Phylo2Vec maps any binary tree with n leaves to a unique integer vector of length n − 1. The advantages of Phylo2Vec are fourfold: (i) fast tree sampling, (ii) compressed tree representation compared to a Newick string, (iii) quick and unambiguous verification of whether two binary trees are topologically identical, and (iv) systematic ability to traverse tree space in very large or small jumps. As a proof of concept, we use Phylo2Vec for maximum likelihood inference on five real-world datasets and show that a simple hill-climbing-based optimisation scheme can efficiently traverse the vastness of tree space from a random to an optimal tree.
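As a rough illustration of why an integer vector of length n − 1 can index every binary tree, the sketch below counts and samples candidate vectors. It rests on an assumption that the abstract does not state: that entry i (0-indexed) ranges over {0, ..., 2i}. Under that assumption the number of valid vectors equals (2n − 3)!!, the known number of rooted binary trees with n labelled leaves. The function names are hypothetical and are not the API of any Phylo2Vec software; the sketch does not convert vectors to trees.

```python
import random
from math import prod


def sample_candidate_vector(n_leaves, rng=random):
    """Draw a vector of length n_leaves - 1.

    ASSUMPTION (not stated in the abstract): entry i is an integer
    in {0, ..., 2*i}. Each entry is drawn independently and uniformly.
    """
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


def is_valid_vector(v):
    """Check the assumed constraint 0 <= v[i] <= 2*i for every entry."""
    return all(isinstance(x, int) and 0 <= x <= 2 * i for i, x in enumerate(v))


def count_candidate_vectors(n_leaves):
    """Number of vectors satisfying the assumed constraint: 1 * 3 * 5 * ..."""
    return prod(2 * i + 1 for i in range(n_leaves - 1))


def count_rooted_binary_trees(n_leaves):
    """(2n - 3)!!, the number of rooted binary trees with n labelled leaves."""
    result = 1
    for k in range(3, 2 * n_leaves - 2, 2):
        result *= k
    return result


if __name__ == "__main__":
    v = sample_candidate_vector(6)
    print(v, is_valid_vector(v))
    for n in range(2, 8):
        print(n, count_candidate_vectors(n), count_rooted_binary_trees(n))
```

If the encoding is a bijection, as the abstract's "unique integer vector" suggests, then comparing two topologies reduces to comparing two such vectors element by element, and drawing entries independently samples trees without any tree-building step.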
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- (Version of record, pdf, 7.3MB)
- Publisher copy:
- 10.1093/sysbio/syae030
Authors
- Publisher:
- Oxford University Press
- Journal:
- Systematic Biology
- Volume:
- 74
- Issue:
- 2
- Pages:
- 250-266
- Publication date:
- 2024-06-27
- Acceptance date:
- 2024-06-19
- DOI:
- 10.1093/sysbio/syae030
- EISSN:
- 1076-836X
- ISSN:
- 1063-5157
- Language:
- English
- Keywords:
- Pubs id:
- 2009297
- Local pid:
- pubs:2009297
- Deposit date:
- 2024-06-21
Terms of use
- Copyright holder:
- Penn et al.
- Copyright date:
- 2024
- Rights statement:
- © The Author(s) 2024. Published by Oxford University Press on behalf of the Society of Systematic Biologists. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
- Licence:
- CC Attribution (CC BY)