ddRAD-seq variant calling in peach and the effect of removing PCR duplicates

Authors
Ksouri, Najla; Benítez, M; Aballay, Maximiliano Martín; Sánchez, Gerardo; Contreras-Moreira, Bruno; Gogorcena Aoiz, Yolanda
Publication year
2022
Language
English
Resource type
article
Status
published version
Description
X International Peach Symposium, Naoussa (Greece), December 2022
Double digest RAD-seq (ddRAD-seq) is a flexible and cost-effective strategy that has emerged as one of the most popular genotyping approaches in plants. It relies on combining two restriction enzymes for library preparation, followed by PCR amplification of the template molecules. However, PCR introduces sequence duplicates, which may erroneously inflate the confidence of genotype calls at a particular site. Variant calling is relatively straightforward but time-consuming because it involves multiple steps, so removing any unneeded step would reduce computation time and simplify the analysis. The primary aim of this study is therefore to evaluate whether removing PCR duplicates is necessary and how it affects SNP and indel calling in peach. In addition, accurate identification of genetic variants in plants is a crucial step toward understanding phenotypic traits and monitoring breeding programs. However, false positive calls are a common issue that can hamper the detection of relevant variants, so a good combination of computational tools for alignment and variant calling is needed to tackle these artifacts. To address this challenge, three variant callers (BCFtools-mpileup, FreeBayes and GATK-HaplotypeCaller) were combined downstream of the BWA-MEM read mapper. Variants found in the intersection of the three call sets are selected as a high-confidence set and flagged for subsequent analysis. The pipeline is documented and available as a set of Makefiles that can be adapted to any species. This work provides useful guidelines and a reproducible workflow for variant detection using ddRAD-seq data.
EEA San Pedro
Affiliation: Ksouri, Najla. Consejo Superior de Investigaciones Científicas (CSIC). Estación Experimental Aula Dei; Spain
Affiliation: Benítez, M.M. Consejo Superior de Investigaciones Científicas (CSIC). Estación Experimental Aula Dei; Spain
Affiliation: Aballay, Maximiliano Martín. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina
Affiliation: Sánchez, Gerardo. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina
Affiliation: Contreras-Moreira, Bruno. Consejo Superior de Investigaciones Científicas (CSIC). Estación Experimental Aula Dei; Spain
Affiliation: Gogorcena Aoiz, Yolanda. Consejo Superior de Investigaciones Científicas (CSIC). Estación Experimental Aula Dei; Spain
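The intersection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' Makefile pipeline: it assumes each caller's output has already been reduced to plain VCF lines, treats a variant as the tuple (chromosome, position, REF, ALT), and uses made-up records on hypothetical chromosome names (`Pp01`, `Pp02`). The names `Variant`, `parse_vcf_lines` and `high_confidence` are illustrative only.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Key used to decide whether two callers report the same variant."""
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect one Variant key per ALT allele from tab-separated VCF lines."""
    variants: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            if alt != ".":
                variants.add(Variant(chrom, pos, ref, alt))
    return variants


def high_confidence(*call_sets: Set[Variant]) -> Set[Variant]:
    """Keep only the variants reported by every caller."""
    if not call_sets:
        return set()
    return set.intersection(*call_sets)


# Made-up records standing in for the output of the three callers.
bcftools_calls = parse_vcf_lines([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "Pp01\t1500\t.\tA\tG\t60\t.\t.",
    "Pp01\t2300\t.\tC\tT\t45\t.\t.",
])
freebayes_calls = parse_vcf_lines([
    "Pp01\t1500\t.\tA\tG\t58\t.\t.",
    "Pp01\t2300\t.\tC\tT\t40\t.\t.",
    "Pp02\t800\t.\tGA\tG\t33\t.\t.",
])
gatk_calls = parse_vcf_lines([
    "Pp01\t1500\t.\tA\tG\t70\t.\t.",
    "Pp02\t800\t.\tGA\tG\t50\t.\t.",
])

consensus = high_confidence(bcftools_calls, freebayes_calls, gatk_calls)
print(sorted(consensus))
```

Only the SNP at `Pp01:1500` is reported by all three toy call sets, so it is the only record kept. On real data, indels would likely need a consistent representation across callers (for example, left-alignment) before exact-match comparison is meaningful; whether and how the original pipeline handles this is not stated in this record.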
Source
Acta horticulturae 1352 : 405-412. (Dec. 2022)
Subject
Prunus persica
Genetic Variation
PCR
Plant Biotechnology
Plant Breeding
Genetic Techniques
Genotyping
Peaches
ddRAD-seq
Double digest RAD-seq
DNA-variants
Access level
restricted access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
INTA Digital (INTA)
Institution
Instituto Nacional de Tecnología Agropecuaria
OAI identifier
oai:localhost:20.500.12123/16484

id INTADig_ac12fb68efa9c8736fddc7888973415f
dc.date.none.fl_str_mv 2022
2024-01-09T12:44:29Z
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
dc.identifier.none.fl_str_mv http://hdl.handle.net/20.500.12123/16484
978-94-62613-52-2
2406-6168
https://doi.org/10.17660/ActaHortic.2022.1352.56
dc.rights.none.fl_str_mv info:eu-repo/semantics/restrictedAccess
http://creativecommons.org/licenses/by-nc-sa/4.0/
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv ISHS
repository.mail.fl_str_mv tripaldi.nicolas@inta.gob.ar