The impact of missing data on real morphological phylogenies: Influence of the number and distribution of missing entries
- Authors
- Prevosti, Francisco Juan; Chemisquy, Maria Amelia
- Publication year
- 2010
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Here we explore the effect of missing data in phylogenetic analyses using a large number of real morphological matrices. Different percentages and patterns of missing entries were added to each matrix, and their influence was evaluated by comparing the accuracy and error of most parsimonious trees. The relationships between accuracy and error and different parameters (e.g. the number of taxa and characters, homoplasy, support) were also evaluated. Our findings, based on real matrices, agree with the simulation studies, i.e. the negative effect increases with the percentage of missing entries, and decreases with the addition of more characters. This indicates that the main problem is the lack of information, not just the presence of missing data per se. Accuracy varies with different distribution patterns of missing entries; the worst case is when missing data are concentrated in a few taxa, while the best is when the missing entries are restricted to just a few characters. The results expand our knowledge of the missing data problem, corroborate many of the findings previously published using simulations, and could be useful for empirical or theoretical studies.
Affiliation: Prevosti, Francisco Juan. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”; Argentina
Affiliation: Chemisquy, Maria Amelia. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Botánica Darwinion. Academia Nacional de Ciencias Exactas, Físicas y Naturales. Instituto de Botánica Darwinion; Argentina
- Subject
- Missing Data; Morphological Phylogenies; Parsimony
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/69010
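The description above says that missing entries were added to each matrix in different percentages and distribution patterns. The record does not give the authors' actual procedure, so the following is only a minimal sketch of that one step, assuming a matrix stored as a list of rows (taxa) by columns (characters). The function name `add_missing`, its parameters, and the three pattern names are hypothetical and chosen for illustration.

```python
import random

MISSING = "?"  # conventional symbol for a missing entry in a character matrix


def add_missing(matrix, fraction, pattern="random", n_focus=2, seed=0):
    """Return a copy of `matrix` with roughly `fraction` of its cells set to MISSING.

    matrix   : list of rows (taxa), each a list of character states
    pattern  : "random"     -> cells drawn from the whole matrix
               "taxa"       -> cells drawn only from `n_focus` random rows
               "characters" -> cells drawn only from `n_focus` random columns
    This is an illustrative assumption, not the procedure used in the paper.
    """
    rng = random.Random(seed)
    n_taxa, n_chars = len(matrix), len(matrix[0])

    if pattern == "random":
        pool = [(t, c) for t in range(n_taxa) for c in range(n_chars)]
    elif pattern == "taxa":
        rows = rng.sample(range(n_taxa), n_focus)
        pool = [(t, c) for t in rows for c in range(n_chars)]
    elif pattern == "characters":
        cols = rng.sample(range(n_chars), n_focus)
        pool = [(t, c) for t in range(n_taxa) for c in cols]
    else:
        raise ValueError(f"unknown pattern: {pattern}")

    # The requested number of cells is capped by what the chosen rows/columns hold.
    n_missing = min(round(fraction * n_taxa * n_chars), len(pool))

    out = [list(row) for row in matrix]  # leave the input untouched
    for t, c in rng.sample(pool, n_missing):
        out[t][c] = MISSING
    return out


if __name__ == "__main__":
    toy = [["0"] * 8 for _ in range(6)]  # 6 taxa x 8 characters, toy data
    for name in ("random", "taxa", "characters"):
        print(name)
        for row in add_missing(toy, 0.25, pattern=name):
            print(" ", "".join(row))
```

Running the same fraction through each pattern makes the contrast in the description concrete: the same number of `?` cells can strip nearly all information from a few taxa, or remove a few characters for every taxon.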
View the full record metadata
id |
CONICETDig_ed9b8cbe7109d62bc99bea2dc222bbc2 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/69010 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
The impact of missing data on real morphological phylogenies: Influence of the number and distribution of missing entries |
dc.creator.none.fl_str_mv |
Prevosti, Francisco Juan; Chemisquy, Maria Amelia |
author |
Prevosti, Francisco Juan |
author_facet |
Prevosti, Francisco Juan; Chemisquy, Maria Amelia |
author_role |
author |
author2 |
Chemisquy, Maria Amelia |
author2_role |
author |
dc.subject.none.fl_str_mv |
Missing Data; Morphological Phylogenies; Parsimony |
topic |
Missing Data; Morphological Phylogenies; Parsimony |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.6; https://purl.org/becyt/ford/1 |
publishDate |
2010 |
dc.date.none.fl_str_mv |
2010-06 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/69010 Prevosti, Francisco Juan; Chemisquy, Maria Amelia; The impact of missing data on real morphological phylogenies: Influence of the number and distribution of missing entries; Wiley Blackwell Publishing, Inc; Cladistics; 26; 3; 6-2010; 326-339 0748-3007 CONICET Digital CONICET |
url |
http://hdl.handle.net/11336/69010 |
identifier_str_mv |
Prevosti, Francisco Juan; Chemisquy, Maria Amelia; The impact of missing data on real morphological phylogenies: Influence of the number and distribution of missing entries; Wiley Blackwell Publishing, Inc; Cladistics; 26; 3; 6-2010; 326-339 0748-3007 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/doi/10.1111/j.1096-0031.2009.00289.x; info:eu-repo/semantics/altIdentifier/url/https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1096-0031.2009.00289.x |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Wiley Blackwell Publishing, Inc |
publisher.none.fl_str_mv |
Wiley Blackwell Publishing, Inc |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |