CURatio: Genome-Wide Phylogenomic Analysis Method Using Ratios of Total Branch Lengths

Qiwen Kang, Neil Moore, Christopher L. Schardl, Ruriko Yoshida

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Evolutionary hypotheses provide important underpinnings of the biological and medical sciences, and a comprehensive, genome-wide understanding of evolutionary relationships among organisms is needed to test and refine such hypotheses. Theory and empirical evidence clearly indicate that phylogenies (trees) of different genes (loci) should not display precisely matching topologies. The main reason for such phylogenetic incongruence is the reticulated evolutionary history of most species, due to meiotic sexual recombination in eukaryotes or horizontal transfers of genetic material in prokaryotes. Nevertheless, many genes should display topologically related phylogenies, and should group into one or more (for genetic hybrids) clusters in poly-dimensional 'tree space'. Unusual evolutionary histories or effects of selection may result in 'outlier' genes with phylogenies that fall outside the main distribution(s) of trees in tree space. We present a new phylogenomic method, CURatio, which uses ratios of total branch lengths in gene trees to help identify phylogenetic outliers in a given set of ortholog groups from multiple genomes. An advantage of CURatio over other methods is that genes absent from and/or duplicated in some genomes can be included in the analysis. We conducted a simulation study under the coalescent model and showed that, given sufficient species depth and topological difference, these ratios are significantly higher for the 'outlier' gene phylogenies. We also applied CURatio to a set of annotated genomes of the fungal family Clavicipitaceae and identified alkaloid biosynthesis genes as outliers, probably due to a history of duplication and loss. The source code is available at https://github.com/QiwenKang/CURatio, and the empirical data set for Clavicipitaceae and the simulated data set are available at Mendeley: https://data.mendeley.com/datasets/mrxts7wjrr/1.
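The abstract does not spell out how the ratios are formed, so the following Python sketch only illustrates the general idea of comparing total branch lengths. It assumes, purely for illustration, that each gene tree is given as a Newick string and is compared against a single reference tree; the function names, the choice of reference tree, and the ranking step are hypothetical and are not taken from the CURatio source code.

```python
import re

# Matches the number that follows each ':' in a Newick string.
# Assumption: taxon labels contain no ':' characters.
BRANCH_LENGTH = re.compile(r":\s*([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)")


def total_branch_length(newick: str) -> float:
    """Sum every branch length found in a Newick tree string."""
    return sum(float(value) for value in BRANCH_LENGTH.findall(newick))


def length_ratio(gene_tree: str, reference_tree: str) -> float:
    """Ratio of the gene tree's total branch length to the reference tree's.

    Hypothetical definition for illustration; the published method may
    construct its ratios differently.
    """
    reference_total = total_branch_length(reference_tree)
    if reference_total == 0:
        raise ValueError("reference tree has zero total branch length")
    return total_branch_length(gene_tree) / reference_total


def rank_by_ratio(gene_trees: dict, reference_tree: str) -> list:
    """Return (gene, ratio) pairs, largest ratio first."""
    ratios = {
        gene: length_ratio(tree, reference_tree)
        for gene, tree in gene_trees.items()
    }
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Toy input: made-up trees, not data from the study.
reference = "((A:0.25,B:0.25):0.25,C:0.25);"
genes = {
    "gene1": "((A:0.25,B:0.25):0.25,C:0.25);",
    "gene2": "((A:0.5,B:0.5):0.25,C:0.75);",
}
ranking = rank_by_ratio(genes, reference)
```

In this toy input, `gene2` has the larger total branch length relative to the reference and therefore ranks first. Deciding which ratios count as significantly high would need a statistical criterion, which this sketch does not attempt.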

Original language: English
Article number: 8515078
Pages (from-to): 981-989
Number of pages: 9
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 17
Issue number: 3
State: Published - May 1 2020

Bibliographical note

Publisher Copyright:
© 2004-2012 IEEE.

Keywords

  • Evolutionary models
  • gene trees
  • likelihood functions
  • outliers
  • phylogenomics
  • species trees

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
