A novel clustering approach for biological data using a new distance based on Gene Ontology
- Authors
- Leale, Guillermo; Milone, Diego H.; Bayá, Ariel E.; Granitto, Pablo Miguel; Stegmayer, Georgina
- Publication year
- 2013
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- When clustering algorithms are applied to biological data, information about biological processes is usually not explicitly available, although biologists later use this knowledge to validate the clusters and the relations found in the data. This work presents a new distance measure for biological data that combines expression and semantic information for use in a clustering algorithm. The distance is computed for every pair of genes and incorporated into the training process of the clustering algorithm. The approach was evaluated on two real datasets using several validation measures. The results are consistent across all measures and show better semantic quality for the clusters produced by the new algorithm than for those produced by standard clustering.
- Sociedad Argentina de Informática e Investigación Operativa
- Subject
- Ciencias Informáticas (Computer Science); biological data; clustering algorithm; measures
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-sa/4.0/
- Repository
- SEDICI (UNLP)
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/76213
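
The description in this record mentions a pairwise distance that combines expression and semantic information. The record does not give the formula, so the following is only a minimal sketch under stated assumptions: expression distance is taken as Euclidean distance scaled to [0, 1], semantic distance is approximated by the Jaccard distance between sets of Gene Ontology term IDs, and the two are mixed with a hypothetical weight `alpha`. None of these choices, nor the function names, come from the paper itself.

```python
import math
from itertools import combinations


def expression_distance(x, y):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def semantic_distance(terms_a, terms_b):
    """Jaccard distance between two sets of GO term IDs.

    Assumption: this is a simple stand-in for a GO-based semantic measure,
    not the measure defined in the paper.
    """
    union = terms_a | terms_b
    if not union:
        return 1.0  # assumption: unannotated pairs count as maximally distant
    return 1.0 - len(terms_a & terms_b) / len(union)


def combined_distance_matrix(expr, go_terms, alpha=0.5):
    """Return a symmetric dict-of-dicts with the combined distance per gene pair.

    expr:     dict mapping gene name -> list of expression values
    go_terms: dict mapping gene name -> set of GO term IDs
    alpha:    hypothetical weight of the expression part (0..1)
    """
    genes = sorted(expr)
    # Raw expression distance for every unordered pair of genes.
    raw = {
        (g, h): expression_distance(expr[g], expr[h])
        for g, h in combinations(genes, 2)
    }
    # Scale expression distances to [0, 1] so both parts are comparable.
    max_raw = max(raw.values(), default=0.0) or 1.0

    dist = {g: {h: 0.0 for h in genes} for g in genes}
    for (g, h), d_expr in raw.items():
        d_sem = semantic_distance(go_terms.get(g, set()), go_terms.get(h, set()))
        d = alpha * (d_expr / max_raw) + (1.0 - alpha) * d_sem
        dist[g][h] = dist[h][g] = d
    return dist
```

A precomputed matrix of this kind could then be passed to any clustering method that accepts pairwise distances. How the paper integrates the distance into the training process is not specified in this record.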
Full metadata record
id |
SEDICI_2b7aa2ffda30c813ed9067fb2a3d29e3 |
---|---|
oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/76213 |
network_acronym_str |
SEDICI |
repository_id_str |
1329 |
network_name_str |
SEDICI (UNLP) |
dc.title.none.fl_str_mv |
A novel clustering approach for biological data using a new distance based on Gene Ontology |
dc.creator.none.fl_str_mv |
Leale, Guillermo; Milone, Diego H.; Bayá, Ariel E.; Granitto, Pablo Miguel; Stegmayer, Georgina |
dc.subject.none.fl_str_mv |
Ciencias Informáticas; biological data; clustering algorithm; measures |
publishDate |
2013 |
dc.date.none.fl_str_mv |
2013-09 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Objeto de conferencia; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/76213 |
url |
http://sedici.unlp.edu.ar/handle/10915/76213 |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/issn/1850-2784 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-sa/4.0/; Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-sa/4.0/; Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.format.none.fl_str_mv |
application/pdf; 85-96 |
dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
reponame_str |
SEDICI (UNLP) |
collection |
SEDICI (UNLP) |
instname_str |
Universidad Nacional de La Plata |
instacron_str |
UNLP |
institution |
UNLP |
repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |