Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses,...

Authors
Barone, Mariana Lucia; Wilson, Jeremy Dean; Ramirez, Martin Javier
Publication year
2024
Language
English
Resource type
dataset
Status
Description
We combined the COI sequence data with legacy multigene sequence data to create a new, taxon-rich phylogeny for the Amaurobioidinae. We used sequences for four loci that have been used in previous studies on the subfamily: two mitochondrial loci, COI (658 bp) and ribosomal subunit 16S (16S, 410 bp); and two nuclear loci, Histone H3 (H3, 327 bp) and ribosomal subunit 28S (28S, 839 bp). We complemented the Amaurobioidinae data with sequences from several non-amaurobioidine anyphaenids and two clubionids as outgroups.
Sequence alignment was performed either with the MAFFT (ver. 7.308) plugin in Geneious, allowing MAFFT to select an appropriate alignment strategy automatically based on the properties of each locus, or with the online MAFFT server (https://mafft.cbrc.jp), which consistently selected the L-INS-i algorithm. Finally, the alignments of the four loci were concatenated into a 2234 bp multigene sequence matrix containing 692 taxa, with about 55% missing/gap data (“full” matrix henceforth). To check that excessive missing data did not affect the resulting topology, we also constructed a reduced matrix by removing additional COI-only specimens, so that each species and morphotype was represented by just one or two specimens for which all loci were available (where possible). After realignment, this reduced matrix was 2235 bp long, included 167 taxa, and had about 22% missing/gap data (“reduced” matrix henceforth).
Phylogenetic analyses under maximum likelihood, including model selection, were then conducted with IQ-TREE 2. We analysed both concatenated matrices (the full matrix and the reduced matrix) and each individual locus. For model selection, we provided an initial scheme that partitioned the matrix by locus and further partitioned the protein-coding loci (COI and H3) by codon position. We used ModelFinder to select substitution models and to search for the best partition scheme, both within IQ-TREE.
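As a minimal sketch of the concatenation and of the initial partition scheme described above, the following Python snippet sums the reported locus lengths and writes one character set per locus, split by codon position for the protein-coding loci. The concatenation order, the charset names, the assumption that each coding locus starts at codon position 1, and the helper names (`initial_charsets`, `missing_fraction`) are all hypothetical choices made for illustration; they are not taken from the dataset files.

```python
# Locus lengths (bp) as reported in the description. The order of
# concatenation is an assumption of this sketch, not taken from the dataset.
LOCI = [
    ("COI", 658, True),   # protein-coding, split by codon position
    ("16S", 410, False),
    ("H3", 327, True),    # protein-coding, split by codon position
    ("28S", 839, False),
]


def initial_charsets(loci):
    """Return (name, range) pairs in NEXUS charset notation.

    Non-coding loci get one charset; coding loci get one per codon
    position, assuming (hypothetically) that the locus starts in frame.
    """
    charsets = []
    start = 1
    for name, length, coding in loci:
        end = start + length - 1
        if coding:
            for pos in range(3):
                charsets.append((f"{name}_{pos + 1}", f"{start + pos}-{end}\\3"))
        else:
            charsets.append((name, f"{start}-{end}"))
        start = end + 1
    return charsets


def missing_fraction(rows):
    """Fraction of alignment cells that are gaps ('-') or missing ('?')."""
    cells = "".join(rows)
    return (cells.count("-") + cells.count("?")) / len(cells)


total_length = sum(length for _, length, _ in LOCI)
charsets = initial_charsets(LOCI)

# Print the scheme as a NEXUS sets block.
print("#nexus")
print("begin sets;")
for name, rng in charsets:
    print(f"    charset {name} = {rng};")
print("end;")
print("total length:", total_length)
```

The four unaligned locus lengths sum to 658 + 410 + 327 + 839 = 2234 bp, which matches the reported length of the full matrix. `missing_fraction` shows how a missing/gap percentage could be computed from the rows of a matrix; it is demonstrated here only on toy input, not on the deposited alignments.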
The best models (partitions) for the full dataset were: GTR+F+I+G4 (16S), GTR+F+I+I+R4 (28S), TVM+F+I+I+R2 (COI-1), TIM2+F+R4 (COI-2), GTR+F+R5 (COI-3), TVMe+G4 (H3-1, H3-2), SYM+G4 (H3-3); and for the reduced dataset: GTR+F+I+G4 (16S), GTR+F+I+G4 (28S), GTR+F+I+G4 (COI-2), GTR+F+I+G4 (COI-3), TVM+F+I+G4 (COI-1, H3-2), GTR+F+I+G4 (H3-1), GTR+F+I+G4 (H3-3).
For each dataset, once the best models and partitions were defined, we ran 10 independent tree-search replicates followed by 1000 ultrafast bootstrap replicates, and retained the replicate with the highest likelihood. Phylogenetic analyses under parsimony were performed with TNT under equal weights, using the “new technology” search with default settings, requiring 10 independent hits to the minimum length and submitting the resulting trees to a round of TBR branch swapping.
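The replicate-selection step can be illustrated with a short Python sketch: given the log-likelihood of each independent replicate, keep the one with the highest value. The log-likelihood values below are invented for illustration only and are not results from this dataset; `best_replicate` is a hypothetical helper name.

```python
# Hypothetical log-likelihoods for 10 independent tree-search replicates.
# These numbers are made up for illustration; they are NOT from the dataset.
replicate_lnl = {
    1: -81250.4,
    2: -81247.9,
    3: -81252.1,
    4: -81246.3,
    5: -81249.0,
    6: -81251.7,
    7: -81248.2,
    8: -81246.8,
    9: -81253.5,
    10: -81247.1,
}


def best_replicate(lnl_by_replicate):
    """Return the (replicate id, lnL) pair with the highest log-likelihood."""
    return max(lnl_by_replicate.items(), key=lambda item: item[1])


best_id, best_lnl = best_replicate(replicate_lnl)
print(f"best replicate: {best_id} (lnL = {best_lnl})")
```

Because log-likelihoods are negative, the highest likelihood corresponds to the value closest to zero.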
Affiliation: Barone, Mariana Lucia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Museo Argentino de Ciencias Naturales "Bernardino Rivadavia"; Argentina
Affiliation: Wilson, Jeremy Dean. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Museo Argentino de Ciencias Naturales "Bernardino Rivadavia"; Argentina
Affiliation: Ramirez, Martin Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Museo Argentino de Ciencias Naturales "Bernardino Rivadavia"; Argentina
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/247231

id CONICETDig_ff4369dae9a7ce9fb0696af3a144ea7b
oai_identifier_str oai:ri.conicet.gov.ar:11336/247231
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses, and phylogenetic trees
dc.creator.none.fl_str_mv Barone, Mariana Lucia
Wilson, Jeremy Dean
Ramirez, Martin Javier
author Barone, Mariana Lucia
author_facet Barone, Mariana Lucia
Wilson, Jeremy Dean
Ramirez, Martin Javier
author_role author
author2 Wilson, Jeremy Dean
Ramirez, Martin Javier
author2_role author
author
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.6
https://purl.org/becyt/ford/1
publishDate 2024
dc.date.none.fl_str_mv 2024
dc.type.none.fl_str_mv info:ar-repo/semantics/conjuntoDeDatos
v1.0
info:eu-repo/semantics/dataSet
format dataSet
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/247231
Barone, Mariana Lucia; Wilson, Jeremy Dean; Ramirez, Martin Javier; (2024): Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses, and phylogenetic trees. Consejo Nacional de Investigaciones Científicas y Técnicas. (dataset). http://hdl.handle.net/11336/247231
CONICET Digital
CONICET
url http://hdl.handle.net/11336/247231
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/grantAgreement/Ministerio de Ciencia, Tecnología e Innovación Productiva. Agencia Nacional de Promoción Científica y Tecnológica/2017-2689
info:eu-repo/grantAgreement/Ministerio de Ciencia, Tecnología e Innovación Productiva. Agencia Nacional de Promoción Científica y Tecnológica. Fondo para la Investigación Científica y Tecnológica/2017-2689
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/octet-stream
application/octet-stream
application/octet-stream
application/octet-stream
application/octet-stream
application/octet-stream
application/octet-stream
application/octet-stream
application/octet-stream
application/octet-stream
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str CONICET Digital (CONICET)
collection CONICET Digital (CONICET)
instname_str Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar