Taxonomic evidence applying algorithms of intelligent data mining : Asteroids families
- Authors
- Perichinsky, Gregorio; Servente, Magdalena; Servetto, Arturo Carlos; García Martínez, Ramón; Orellana, Rosa Beatriz; Plastino, Ángel Luis
- Publication year
- 2003
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Numerical Taxonomy aims to group operational taxonomic units (OTUs, or taxa) into clusters by numerical methods, using so-called structure analysis. Obtaining the clusters that constitute families has been the purpose of this latest series of projects. Structural analysis, based on the phenotypic characteristics of the OTUs, exhibits the relationships between two or more OTUs in terms of degrees of similarity. Entities formed by dynamic domains of attributes change according to taxonomical requirements: the classification of objects to form families. Taxonomic objects are represented by applying the semantics of the Dynamic Relational Database Model. Families of OTUs are obtained using two tools: i) the Euclidean distance and ii) nearest-neighbor techniques. Taxonomic evidence is thus gathered to quantify the similarity of each pair of OTUs (pair-group method) obtained from the basic data matrix. The main contribution so far is the introduction of the concept of the spectrum of the OTUs, based on the states of their characters. The concept of families’ spectra emerges when the superposition principle is applied to the spectra of the OTUs, and the groups are delimited through the maximum of the Bienaymé-Tchebycheff relation, which determines the invariants (centroid, variance and radius). A new taxonomic criterion is thereby formulated. An astronomical application is worked out; the result is a new criterion for the classification of asteroids in the hyperspace of orbital proper elements. Thus, a new approach to Computational Taxonomy is presented, one that has already been employed with reference to Data Mining. This paper analyses the application of Machine Learning techniques to Data Mining. We focused on the TDIDT (Top-Down Induction of Decision Trees) family of algorithms for induction from pre-classified data, and in particular on the ID3 and C4.5 algorithms created by Quinlan.
We tried to determine the degree of efficiency achieved by the TDIDT family’s algorithms when applied in data mining to generate valid models of the data in classification problems, using entropy gain as the splitting criterion. Informatics (Data Mining and Computational Taxonomy) remains the original objective of our research.
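The entropy-gain criterion mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the attribute names `a` and `e`, their discretised values and the family labels `F1` and `F2` are all hypothetical, chosen only to show how an ID3-style algorithm would rank attributes on pre-classified data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the data on one discrete attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(p) / len(labels) * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data (an assumption, not taken from the paper):
# two discretised attributes per object and a made-up family label.
rows = [
    {"a": "inner", "e": "low"},
    {"a": "inner", "e": "high"},
    {"a": "outer", "e": "low"},
    {"a": "outer", "e": "high"},
]
labels = ["F1", "F1", "F2", "F2"]

# An ID3-style step picks the attribute with the largest gain as the next split.
best = max(rows[0], key=lambda attr: information_gain(rows, labels, attr))
```

In this toy data the attribute `a` separates the two labels completely while `e` carries no information about them, so `a` is selected. C4.5 refines this step by using the gain ratio instead of the raw gain, which this sketch does not cover.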
Track: Databases
Red de Universidades con Carreras en Informática (RedUNCI) - Subject
-
Computer Science
classification
cluster (family)
spectrum
induction
divide and rule
entropy
database
Algorithms
Data mining - Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/21405
View the full record metadata
id |
SEDICI_6edfc4ba96eb7f2d7c86f26e46a689b2 |
---|---|
oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/21405 |
network_acronym_str |
SEDICI |
repository_id_str |
1329 |
network_name_str |
SEDICI (UNLP) |
dc.title.none.fl_str_mv |
Taxonomic evidence applying algorithms of intelligent data mining : Asteroids families |
dc.creator.none.fl_str_mv |
Perichinsky, Gregorio; Servente, Magdalena; Servetto, Arturo Carlos; García Martínez, Ramón; Orellana, Rosa Beatriz; Plastino, Ángel Luis |
dc.subject.none.fl_str_mv |
Computer Science; classification; cluster (family); spectrum; induction; divide and rule; entropy; database; Algorithms; Data mining |
publishDate |
2003 |
dc.date.none.fl_str_mv |
2003-05 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/21405 |
url |
http://sedici.unlp.edu.ar/handle/10915/21405 |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/2.5/ar/; Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf; 322-328 |
dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
reponame_str |
SEDICI (UNLP) |
collection |
SEDICI (UNLP) |
instname_str |
Universidad Nacional de La Plata |
instacron_str |
UNLP |
institution |
UNLP |
repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |