On divide-and-conquer strategies for parsimony analysis of large data sets: Rec-I-dcm3 vs. TNT

Authors: Goloboff, Pablo Augusto; Pol, Diego
Year of publication: 2007
Language: English
Resource type: article
Status: published version
Description: Roshan et al. recently described a “divide-and-conquer” technique for parsimony analysis of large datasets, Rec-I-DCM3, and stated that it compares very favorably to results using the program TNT. Their technique is based on selecting subsets of taxa to create reduced datasets or subproblems, finding most-parsimonious trees for each reduced data set, recombining all parts together, and then performing global TBR swapping on the combined tree. Here, we contrast this approach with sectorial searches, a divide-and-conquer algorithm implemented in TNT. This algorithm also uses a guide tree to create subproblems, with the first-pass state sets of the nodes that join the selected sectors with the rest of the topology; this allows exact length calculations for the entire topology (that is, any solution N steps shorter than the original, for the reduced subproblem, must also be N steps shorter for the entire topology). We show here that, for sectors of similar size analyzed with the same search algorithms, subdividing datasets with sectorial searches produces better results than subdividing with Rec-I-DCM3. Roshan et al.’s claim that Rec-I-DCM3 outperforms the techniques in TNT resulted from a poor experimental design and the algorithmic settings used for the runs in TNT. In particular, for finding trees at or very close to the minimum known length of the analyzed datasets, TNT clearly outperforms Rec-I-DCM3. Finally, we show that the performance of Rec-I-DCM3 is bound by the efficiency of the TBR implementation for the complete dataset, as this method behaves (after some number of iterations) as a technique for cyclic perturbations and improvements more than as a divide-and-conquer strategy.
Affiliation: Goloboff, Pablo Augusto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Tucumán. Unidad Ejecutora Lillo; Argentina
Affiliation: Pol, Diego. Museo Paleontológico Egidio Feruglio; Argentina
Subjects: Phylogeny
Algorithms
Cladistics
Tree Searches
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/82978
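
Illustrative note: as a rough illustration of the exact-length property mentioned in the description, the following Python sketch runs Fitch's first pass for a single unordered character on a small rooted tree. It is not taken from TNT or Rec-I-DCM3. The tuple-based tree encoding, the function names `fitch_first_pass` and `length`, and the toy states are assumptions made for this example, and it covers only the simple case of a subclade replaced by one terminal that carries the subclade's first-pass state set.

```python
# Hypothetical toy example: a tree is either a leaf (a frozenset of states)
# or a 2-tuple (left_subtree, right_subtree). One character only.

def fitch_first_pass(tree):
    """Return (first-pass state set, steps) for the subtree rooted at `tree`."""
    if isinstance(tree, frozenset):
        return tree, 0                      # a terminal costs no steps
    left, right = tree
    lset, lsteps = fitch_first_pass(left)
    rset, rsteps = fitch_first_pass(right)
    common = lset & rset
    if common:                              # descendants agree: no extra step
        return common, lsteps + rsteps
    return lset | rset, lsteps + rsteps + 1  # disagreement: one extra step


def length(tree):
    """Parsimony length of `tree` for this single character."""
    return fitch_first_pass(tree)[1]


A, C, G = frozenset("A"), frozenset("C"), frozenset("G")

# Subclade that lies outside the part of the tree being rearranged.
clade = (A, C)
clade_set, clade_steps = fitch_first_pass(clade)

# Full problem: the tree before and after a rearrangement that leaves `clade` intact.
full_before = length(((clade, G), (C, G)))
full_after = length(((clade, C), (G, G)))

# Reduced problem: `clade` is replaced by one terminal with its first-pass state set.
reduced_before = length(((clade_set, G), (C, G)))
reduced_after = length(((clade_set, C), (G, G)))

print("steps saved, full tree:   ", full_before - full_after)
print("steps saved, reduced tree:", reduced_before - reduced_after)
```

In this toy case the rearrangement saves the same number of steps in the full and in the reduced problem, because the steps inside the collapsed subclade do not change.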

id	CONICETDig_ea5eae2fb8104c3077652872a21a7d47
oai_identifier_str	oai:ri.conicet.gov.ar:11336/82978
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.title.none.fl_str_mv	On divide-and-conquer strategies for parsimony analysis of large data sets: Rec-I-dcm3 vs. TNT
dc.creator.none.fl_str_mv	Goloboff, Pablo Augusto Pol, Diego
dc.subject.none.fl_str_mv	Phylogeny Algorithms Cladistics Tree Searches
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.6 https://purl.org/becyt/ford/1
publishDate	2007
dc.date.none.fl_str_mv	2007-12
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/82978 Goloboff, Pablo Augusto; Pol, Diego; On divide-and-conquer strategies for parsimony analysis of large data sets: Rec-I-dcm3 vs. TNT; Oxford University Press; Systematic Biology; 56; 3; 12-2007; 485-495 1063-5157 CONICET Digital CONICET
url	http://hdl.handle.net/11336/82978
dc.language.none.fl_str_mv	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/doi/10.1080/10635150701431905 info:eu-repo/semantics/altIdentifier/url/https://academic.oup.com/sysbio/article-pdf/56/3/485/24203534/56-3-485.pdf
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf
dc.publisher.none.fl_str_mv	Oxford University Press
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
