Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses,...

Authors: Barone, Mariana Lucia; Wilson, Jeremy Dean; Ramirez, Martin Javier
Publication year: 2024
Language: English
Resource type: dataset
Description: We combined the COI sequence data with legacy multigene sequence data to create a new, taxon-rich phylogeny for the Amaurobioidinae. We used sequences for four loci that have been used in previous studies on the subfamily: two mitochondrial loci, COI (658 bp) and ribosomal subunit 16S (16S, 410 bp); and two nuclear loci, Histone H3 (H3, 327 bp) and ribosomal subunit 28S (28S, 839 bp). We complemented the Amaurobioidinae data with sequences from several non-amaurobioidine anyphaenids and two clubionids as outgroups. Sequences were aligned either with the MAFFT (ver. 7.308) plugin in Geneious, allowing MAFFT to select an alignment strategy automatically based on the properties of each locus, or with the online MAFFT server (https://mafft.cbrc.jp), which consistently selected the L-INS-i algorithm. Finally, the alignments of the four loci were concatenated into a 2234 bp multigene sequence matrix containing 692 taxa, with about 55% missing/gap data (“full” matrix henceforth). To ensure that excessive missing data did not affect the resulting topology, we also constructed a reduced matrix by removing additional COI-only specimens so that each species and morphotype was represented by just one or two specimens for which all loci were available (where possible). After realignment, this reduced matrix was 2235 bp long, included 167 taxa, and had about 22% missing/gap data (“reduced” matrix henceforth). Phylogenetic analyses under maximum likelihood, including model selection, were then conducted with IQ-TREE 2. We performed phylogenetic analyses on both concatenated matrices (the full matrix and the reduced matrix) and on each individual locus. For model selection, we provided an initial scheme that partitioned the matrix by locus, and further partitioned the protein-coding loci (COI and H3) by codon position. We used ModelFinder to select substitution models and to search for the best partition scheme, both within IQ-TREE.
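The concatenation, the missing-data percentage, and the initial partition scheme described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The locus lengths are taken from the description (they sum to the reported 2234 bp), but the concatenation order, the use of '?' for absent loci, the charset names, and the assumption that each coding locus starts at codon position 1 are all hypothetical choices made for this illustration. The charset lines follow the NEXUS sets-block style that IQ-TREE accepts for partition files.

```python
from typing import Dict, List, Tuple

# (locus, length in bp, protein-coding?) with lengths as reported in the
# description. The concatenation order is an assumption for illustration.
LOCI: List[Tuple[str, int, bool]] = [
    ("COI", 658, True),
    ("16S", 410, False),
    ("H3", 327, True),
    ("28S", 839, False),
]


def concatenate(alignments: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Join per-locus alignments into one matrix, padding absent loci with '?'."""
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    matrix: Dict[str, str] = {}
    for taxon in taxa:
        parts = []
        for name, length, _ in LOCI:
            seq = alignments.get(name, {}).get(taxon, "?" * length)
            if len(seq) != length:
                raise ValueError(f"{taxon}/{name}: expected {length} columns")
            parts.append(seq)
        matrix[taxon] = "".join(parts)
    return matrix


def missing_fraction(matrix: Dict[str, str]) -> float:
    """Share of matrix cells that are gaps ('-') or missing data ('?')."""
    cells = "".join(matrix.values())
    return (cells.count("-") + cells.count("?")) / len(cells)


def initial_partitions() -> List[str]:
    """Charset lines: one per locus, split by codon position if coding.

    Assumes every coding locus starts at codon position 1 (hypothetical).
    """
    lines: List[str] = []
    start = 1
    for name, length, coding in LOCI:
        end = start + length - 1
        if coding:
            for pos in range(3):
                lines.append(f"charset {name}_pos{pos + 1} = {start + pos}-{end}\\3;")
        else:
            lines.append(f"charset {name} = {start}-{end};")
        start = end + 1
    return lines


if __name__ == "__main__":
    print("\n".join(initial_partitions()))
```

Running the module prints eight charset lines (three each for COI and H3, one each for 16S and 28S), which could be wrapped in a `begin sets; ... end;` block to form a starting scheme for model selection.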
The best models (partitions) for the full dataset were: GTR+F+I+G4 (16S), GTR+F+I+I+R4 (28S), TVM+F+I+I+R2 (COI-1), TIM2+F+R4 (COI-2), GTR+F+R5 (COI-3), TVMe+G4 (H3-1, H3-2), SYM+G4 (H3-3); and for the reduced dataset: GTR+F+I+G4 (16S), GTR+F+I+G4 (28S), GTR+F+I+G4 (COI-2), GTR+F+I+G4 (COI-3), TVM+F+I+G4 (COI-1, H3-2), GTR+F+I+G4 (H3-1), GTR+F+I+G4 (H3-3). For each dataset, once the best models and partitions were defined, we ran 10 independent replicates of the tree search followed by 1000 ultrafast bootstrap replicates, and kept the replicate with the highest likelihood. Phylogenetic analyses under parsimony were run with TNT under equal weights, using the “new technology” search with default values, requesting 10 independent hits to the minimal length, and submitting the resulting trees to a round of TBR branch swapping.
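Choosing the replicate that reached the maximum likelihood amounts to taking the run with the highest (least negative) log-likelihood. The short Python sketch below shows that selection step only. The function name, the run labels, and the log-likelihood values are hypothetical and are not results from this dataset.

```python
from typing import Mapping


def best_replicate(log_likelihoods: Mapping[str, float]) -> str:
    """Return the label of the replicate with the highest log-likelihood."""
    if not log_likelihoods:
        raise ValueError("no replicates supplied")
    # Log-likelihoods are negative, so the maximum is the value closest to zero.
    return max(log_likelihoods, key=lambda run: log_likelihoods[run])


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    runs = {"run01": -100.5, "run02": -99.2, "run03": -101.0}
    print(best_replicate(runs))
```

In practice the per-run log-likelihoods would be read from each replicate's output report before being passed to such a function.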
Affiliation: Barone, Mariana Lucia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Museo Argentino de Ciencias Naturales "Bernardino Rivadavia"; Argentina
Affiliation: Wilson, Jeremy Dean. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Museo Argentino de Ciencias Naturales "Bernardino Rivadavia"; Argentina
Affiliation: Ramirez, Martin Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Museo Argentino de Ciencias Naturales "Bernardino Rivadavia"; Argentina
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/247231

id	CONICETDig_ff4369dae9a7ce9fb0696af3a144ea7b
oai_identifier_str	oai:ri.conicet.gov.ar:11336/247231
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.title.none.fl_str_mv	Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses, and phylogenetic trees
title	Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses, and phylogenetic trees
dc.creator.none.fl_str_mv	Barone, Mariana Lucia Wilson, Jeremy Dean Ramirez, Martin Javier
author	Barone, Mariana Lucia
author_facet	Barone, Mariana Lucia Wilson, Jeremy Dean Ramirez, Martin Javier
author_role	author
author2	Wilson, Jeremy Dean Ramirez, Martin Javier
author2_role	author author
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.6 https://purl.org/becyt/ford/1
publishDate	2024
dc.date.none.fl_str_mv	2024
dc.type.none.fl_str_mv	info:ar-repo/semantics/conjuntoDeDatos v1.0 info:eu-repo/semantics/dataSet
format	dataSet
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/247231 Barone, Mariana Lucia; Wilson, Jeremy Dean; Ramirez, Martin Javier; (2024): Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses, and phylogenetic trees. Consejo Nacional de Investigaciones Científicas y Técnicas. (dataset). http://hdl.handle.net/11336/247231 CONICET Digital CONICET
url	http://hdl.handle.net/11336/247231
identifier_str_mv	Barone, Mariana Lucia; Wilson, Jeremy Dean; Ramirez, Martin Javier; (2024): Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses, and phylogenetic trees. Consejo Nacional de Investigaciones Científicas y Técnicas. (dataset). http://hdl.handle.net/11336/247231 CONICET Digital CONICET
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/grantAgreement/Ministerio de Ciencia. Tecnología e Innovación Productiva. Agencia Nacional de Promoción Científica y Tecnológica/2017-2689 info:eu-repo/grantAgreement/Ministerio de Ciencia, Tecnología e Innovación Productiva. Agencia Nacional de Promoción Científica y Tecnológica. Fondo para la Investigación Científica y Tecnológica/2017-2689
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv	application/octet-stream application/octet-stream application/octet-stream application/octet-stream application/octet-stream application/octet-stream application/octet-stream application/octet-stream application/octet-stream application/octet-stream
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
