On divide-and-conquer strategies for parsimony analysis of large data sets: Rec-I-DCM3 vs. TNT
- Authors
- Goloboff, Pablo Augusto; Pol, Diego
- Publication year
- 2007
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Roshan et al. recently described a “divide-and-conquer” technique for parsimony analysis of large datasets, Rec-I-DCM3, and stated that it compares very favorably to results using the program TNT. Their technique is based on selecting subsets of taxa to create reduced datasets or subproblems, finding most-parsimonious trees for each reduced dataset, recombining all parts together, and then performing global TBR swapping on the combined tree. Here, we contrast this approach to sectorial searches, a divide-and-conquer algorithm implemented in TNT. This algorithm also uses a guide tree to create subproblems, with the first-pass state sets of the nodes that join the selected sectors with the rest of the topology; this allows exact length calculations for the entire topology (that is, any solution N steps shorter than the original, for the reduced subproblem, must also be N steps shorter for the entire topology). We show here that, for sectors of similar size analyzed with the same search algorithms, subdividing datasets with sectorial searches produces better results than subdividing with Rec-I-DCM3. Roshan et al.’s claim that Rec-I-DCM3 outperforms the techniques in TNT was caused by a poor experimental design and algorithmic settings used for the runs in TNT. In particular, for finding trees at or very close to the minimum known length of the analyzed datasets, TNT clearly outperforms Rec-I-DCM3. Finally, we show that the performance of Rec-I-DCM3 is bound by the efficiency of TBR implementation for the complete dataset, as this method behaves (after some number of iterations) as a technique for cyclic perturbations and improvements more than as a divide-and-conquer strategy.
Affiliation: Goloboff, Pablo Augusto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Tucumán. Unidad Ejecutora Lillo; Argentina
Affiliation: Pol, Diego. Museo Paleontológico Egidio Feruglio; Argentina
- Subject
- Phylogeny; Algorithms; Cladistics; Tree Searches
- Accessibility level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/82978
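The description above says that sectorial searches build subproblems from first-pass state sets so that lengths computed on the reduced problem stay exact. The following is a minimal sketch of that idea for a single unordered character under Fitch parsimony. It is not TNT's implementation: the tree encoding, the helper names (`first_pass`, `tip`), and the example states are all assumptions made for illustration, and it only models the clades hanging off a sector, not the part of the tree on the root side of it.

```python
from typing import FrozenSet, Tuple, Union

# Hypothetical encoding: a tree is either a tip (a frozenset of states)
# or a pair of subtrees.
Tree = Union[FrozenSet[str], Tuple["Tree", "Tree"]]


def first_pass(tree: Tree) -> Tuple[FrozenSet[str], int]:
    """Fitch first pass (postorder) for one character.

    Returns the state set assigned to the root of `tree` and the
    number of steps counted inside `tree`.
    """
    if isinstance(tree, frozenset):
        return tree, 0
    left, right = tree
    left_set, left_steps = first_pass(left)
    right_set, right_steps = first_pass(right)
    common = left_set & right_set
    if common:
        # Descendants agree on at least one state: no extra step.
        return common, left_steps + right_steps
    # Descendants disagree: take the union and count one step.
    return left_set | right_set, left_steps + right_steps + 1


def tip(states: str) -> FrozenSet[str]:
    """Build a tip from a string of observed states (illustrative helper)."""
    return frozenset(states)


# Two clades that will be collapsed, plus the remainder of the tree.
clade_x = (tip("A"), tip("C"))
clade_y = ((tip("G"), tip("G")), tip("A"))
rest = (tip("C"), tip("A"))
full_tree = ((clade_x, clade_y), rest)

# Replace each clade by a pseudo-terminal carrying its first-pass state set.
x_set, x_steps = first_pass(clade_x)
y_set, y_steps = first_pass(clade_y)
reduced_tree = ((x_set, y_set), rest)

full_set, full_steps = first_pass(full_tree)
reduced_set, reduced_steps = first_pass(reduced_tree)

print("full tree steps:", full_steps)
print("reduced tree steps + collapsed clade steps:",
      reduced_steps + x_steps + y_steps)
```

Because the first pass only looks at descendants, the pseudo-terminals give the same state sets higher up in the tree as the clades they replace. The reduced length therefore differs from the full length by a constant (the steps inside the collapsed clades), so under this simplification a rearrangement that saves N steps on the reduced tree saves N steps on the full tree as well.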
View the full record metadata
| Field | Value |
|---|---|
| id | CONICETDig_ea5eae2fb8104c3077652872a21a7d47 |
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/82978 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| dc.creator.none.fl_str_mv | Goloboff, Pablo Augusto; Pol, Diego |
| dc.subject.none.fl_str_mv | Phylogeny; Algorithms; Cladistics; Tree Searches |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.6; https://purl.org/becyt/ford/1 |
| publishDate | 2007 |
| dc.date.none.fl_str_mv | 2007-12 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| url | http://hdl.handle.net/11336/82978 |
| identifier_str_mv | Goloboff, Pablo Augusto; Pol, Diego; On divide-and-conquer strategies for parsimony analysis of large data sets: Rec-I-DCM3 vs. TNT; Oxford University Press; Systematic Biology; 56; 3; 12-2007; 485-495; 1063-5157; CONICET Digital; CONICET |
| dc.language.none.fl_str_mv | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1080/10635150701431905; info:eu-repo/semantics/altIdentifier/url/https://academic.oup.com/sysbio/article-pdf/56/3/485/24203534/56-3-485.pdf |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| dc.format.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Oxford University Press |
| reponame_str | CONICET Digital (CONICET) |
| instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |