Quantification of congruence among gene trees with polytomies using overall success of resolution for phylogenomic coalescent analyses
- Authors
- Simmons, Mark P.; Goloboff, Pablo Augusto; Stöver, Ben C.; Springer, Mark S.; Gatesy, John
- Year of publication
- 2023
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Gene-tree-inference error can cause species-tree-inference artefacts in summary phylogenomic coalescent analyses. Here we integrate two ways of accommodating these inference errors: collapsing arbitrarily or dubiously resolved gene-tree branches, and subsampling gene trees based on their pairwise congruence. We tested the effect of collapsing gene-tree branches with 0% approximate-likelihood-ratio-test (SH-like aLRT) support in likelihood analyses and strict consensus trees for parsimony, and then subsampled those partially resolved trees based on congruence measures that do not penalize polytomies. For this purpose we developed a new TNT script for congruence sorting (congsort), and used it to calculate topological incongruence for eight phylogenomic datasets using three distance measures: standard Robinson–Foulds (RF) distances; overall success of resolution (OSR), which is based on counting both matching and contradicting clades; and RF contradictions, which only counts contradictory clades. As expected, we found that gene-tree incongruence was often concentrated in clades that are arbitrarily or dubiously resolved and that there was greater congruence between the partially collapsed gene trees and the coalescent and concatenation topologies inferred from those genes. Coalescent branch lengths typically increased as the most incongruent gene trees were excluded, although branch supports typically did not. We investigated two successful and complementary approaches to prioritizing genes for investigation of alignment or homology errors. Coalescent-tree clades that contradicted concatenation-tree clades were generally less robust to gene-tree subsampling than congruent clades. Our preferred approach to collapsing likelihood gene-tree clades (0% SH-like aLRT support) and subsampling those trees (OSR) generally outperformed competing approaches for a large fungal dataset with respect to branch lengths, support and congruence. We recommend widespread application of this approach (and strict consensus trees for parsimony-based analyses) for improving quantification of gene-tree congruence/conflict, estimating coalescent branch lengths, testing robustness of coalescent analyses to gene-tree-estimation error, and improving topological robustness of summary coalescent analyses. This approach is quick and easy to implement, even for huge datasets.
Affiliation: Simmons, Mark P. State University of Colorado - Fort Collins; United States
Affiliation: Goloboff, Pablo Augusto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Tucumán. Unidad Ejecutora Lillo; Argentina
Affiliation: Stöver, Ben C. Institute for Evolution and Biodiversity; Germany
Affiliation: Springer, Mark S. University of California; United States
Affiliation: Gatesy, John. American Museum of Natural History; United States
- Subject
- phylogeny; tree comparisons; coalescence; parsimony
- Accessibility level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/238605
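Illustrative note on the distance measures named in the description: the sketch below is not the authors' congsort TNT script. It is a minimal Python illustration, under stated assumptions, of why a measure that counts only contradictory clades does not penalize polytomies while standard RF does. Trees are assumed to be rooted and written as nested tuples of leaf names; the function names (`clades`, `rf_distance`, `rf_contradictions`, `rank_by_incongruence`) are hypothetical, and ranking by mean pairwise distance is an assumed simplification of congruence sorting. OSR is not reproduced because the description does not give its formula.

```python
from itertools import chain


def clades(tree):
    """Non-trivial clades of a rooted tree given as nested tuples of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf
            return frozenset([node])
        members = frozenset(chain.from_iterable(leaves(c) for c in node))
        found.add(members)                 # record every internal node
        return members

    found.discard(leaves(tree))            # the root clade is uninformative
    return found


def compatible(a, b):
    """Two clades can coexist in one tree if they are nested or disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def contradicted(clades_a, clades_b):
    """Clades of A that conflict with at least one clade of B."""
    return {a for a in clades_a if any(not compatible(a, b) for b in clades_b)}


def rf_distance(t1, t2):
    """Standard RF on rooted clades: clades present in exactly one tree."""
    return len(clades(t1) ^ clades(t2))


def rf_contradictions(t1, t2):
    """Count only clades that are actively contradicted by the other tree."""
    c1, c2 = clades(t1), clades(t2)
    return len(contradicted(c1, c2)) + len(contradicted(c2, c1))


def rank_by_incongruence(trees, distance):
    """Tree indices, most congruent first, by mean pairwise distance (assumed
    simplification; needs at least two trees)."""
    def mean_distance(i):
        others = [distance(trees[i], t) for j, t in enumerate(trees) if j != i]
        return sum(others) / len(others)
    return sorted(range(len(trees)), key=mean_distance)


resolved = ((("A", "B"), "C"), ("D", "E"))
collapsed = (("A", "B", "C"), ("D", "E"))      # (A,B) collapsed into a polytomy
conflicting = ((("A", "C"), "B"), ("D", "E"))  # (A,C) contradicts (A,B)

print(rf_distance(resolved, collapsed), rf_contradictions(resolved, collapsed))
print(rf_distance(resolved, conflicting), rf_contradictions(resolved, conflicting))
print(rank_by_incongruence([resolved, resolved, collapsed, conflicting],
                           rf_contradictions))
```

In this toy case the collapsed tree differs from the resolved tree under RF but contradicts nothing, so it is not treated as incongruent, whereas the tree with the conflicting clade is ranked last.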
View full record metadata
id |
CONICETDig_84f7d3c5e608a361ffa8a940b3083956 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/238605 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
Quantification of congruence among gene trees with polytomies using overall success of resolution for phylogenomic coalescent analyses |
dc.creator.none.fl_str_mv |
Simmons, Mark P. Goloboff, Pablo Augusto Stöver, Ben C. Springer, Mark S. Gatesy, John |
author |
Simmons, Mark P. |
author_facet |
Simmons, Mark P. Goloboff, Pablo Augusto Stöver, Ben C. Springer, Mark S. Gatesy, John |
author_role |
author |
author2 |
Goloboff, Pablo Augusto Stöver, Ben C. Springer, Mark S. Gatesy, John |
author2_role |
author author author author |
dc.subject.none.fl_str_mv |
phylogeny tree comparisons coalescence parsimony |
topic |
phylogeny tree comparisons coalescence parsimony |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.6 https://purl.org/becyt/ford/1 |
dc.description.none.fl_txt_mv |
Gene-tree-inference error can cause species-tree-inference artefacts in summary phylogenomic coalescent analyses. Here we integrate two ways of accommodating these inference errors: collapsing arbitrarily or dubiously resolved gene-tree branches, and subsampling gene trees based on their pairwise congruence. We tested the effect of collapsing gene-tree branches with 0% approximate-likelihood-ratio-test (SH-like aLRT) support in likelihood analyses and strict consensus trees for parsimony, and then subsampled those partially resolved trees based on congruence measures that do not penalize polytomies. For this purpose we developed a new TNT script for congruence sorting (congsort), and used it to calculate topological incongruence for eight phylogenomic datasets using three distance measures: standard Robinson–Foulds (RF) distances; overall success of resolution (OSR), which is based on counting both matching and contradicting clades; and RF contradictions, which only counts contradictory clades. As expected, we found that gene-tree incongruence was often concentrated in clades that are arbitrarily or dubiously resolved and that there was greater congruence between the partially collapsed gene trees and the coalescent and concatenation topologies inferred from those genes. Coalescent branch lengths typically increased as the most incongruent gene trees were excluded, although branch supports typically did not. We investigated two successful and complementary approaches to prioritizing genes for investigation of alignment or homology errors. Coalescent-tree clades that contradicted concatenation-tree clades were generally less robust to gene-tree subsampling than congruent clades. Our preferred approach to collapsing likelihood gene-tree clades (0% SH-like aLRT support) and subsampling those trees (OSR) generally outperformed competing approaches for a large fungal dataset with respect to branch lengths, support and congruence. We recommend widespread application of this approach (and strict consensus trees for parsimony-based analyses) for improving quantification of gene-tree congruence/conflict, estimating coalescent branch lengths, testing robustness of coalescent analyses to gene-tree-estimation error, and improving topological robustness of summary coalescent analyses. This approach is quick and easy to implement, even for huge datasets. Affiliation: Simmons, Mark P. State University of Colorado - Fort Collins; United States. Affiliation: Goloboff, Pablo Augusto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Tucumán. Unidad Ejecutora Lillo; Argentina. Affiliation: Stöver, Ben C. Institute for Evolution and Biodiversity; Germany. Affiliation: Springer, Mark S. University of California; United States. Affiliation: Gatesy, John. American Museum of Natural History; United States |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-04 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/238605 Simmons, Mark P.; Goloboff, Pablo Augusto; Stöver, Ben C.; Springer, Mark S.; Gatesy, John; Quantification of congruence among gene trees with polytomies using overall success of resolution for phylogenomic coalescent analyses; Wiley Blackwell Publishing, Inc; Cladistics; 39; 5; 4-2023; 418-436 0748-3007 CONICET Digital CONICET |
url |
http://hdl.handle.net/11336/238605 |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/https://onlinelibrary.wiley.com/doi/10.1111/cla.12540 info:eu-repo/semantics/altIdentifier/doi/10.1111/cla.12540 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
Wiley Blackwell Publishing, Inc |
publisher.none.fl_str_mv |
Wiley Blackwell Publishing, Inc |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |