Application of a Bayesian Semi-supervised Learning Strategy to Network Intrusion Detection

Authors: Catania, Carlos Adrián; García Garino, Carlos; Bromberg, Facundo
Publication year: 2010
Language: English
Resource type: conference paper
Status: published version
Description: Supervised learning classifiers have proved to be a viable solution in the network intrusion detection field. In practice, however, it is difficult to obtain the labeled data these approaches require. An alternative that reduces the need for labeled data is to use classifiers that follow a semi-supervised strategy, learning from both labeled and unlabeled data points. One such approach, originally applied to text classification, combines a naïve Bayes (NB) classifier with the expectation maximization (EM) algorithm. Despite some differences, network intrusion detection shares many characteristics of the document classification problem: labeled data are extremely hard to obtain, whereas unlabeled data are plentiful and easily accessible. This work aims to determine the viability of applying semi-supervised techniques to network intrusion detection, with special focus on the combination of the NB classifier and EM. A set of experiments conducted on the 1998 DARPA dataset shows that using EM with unlabeled data can provide significant benefits in classification performance, reducing the amount of labeled data required by 90%.
Sociedad Argentina de Informática e Investigación Operativa
Subject: Computer Science
Intrusion Detection Systems
Semi-supervised Learning
Expectation Maximization
Access level: open access
Terms of use: http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
Institution: Universidad Nacional de La Plata
OAI identifier: oai:sedici.unlp.edu.ar:10915/152809
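
The description above outlines a semi-supervised scheme in which a naïve Bayes classifier is trained on a few labeled records and then refined with EM over unlabeled ones. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes categorical features, Laplace smoothing, and a fixed number of EM iterations. The function names (`m_step`, `posterior`, `em_naive_bayes`), the toy connection records, and the field values (`tcp`, `http`, `SF`, and so on) are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def m_step(rows, weights, classes, alpha=1.0):
    """Estimate NB parameters from rows with per-class (soft) weights."""
    n_features = len(rows[0])
    class_mass = {c: alpha for c in classes}  # smoothed class totals
    counts = {c: [defaultdict(float) for _ in range(n_features)] for c in classes}
    values = [set() for _ in range(n_features)]  # observed values per feature
    for row, w in zip(rows, weights):
        for j, v in enumerate(row):
            values[j].add(v)
            for c in classes:
                counts[c][j][v] += w[c]
        for c in classes:
            class_mass[c] += w[c]
    total = sum(class_mass.values())
    prior = {c: class_mass[c] / total for c in classes}
    return prior, counts, values, alpha


def posterior(row, model, classes):
    """Return P(class | row) under the naive Bayes model (E-step for one row)."""
    prior, counts, values, alpha = model
    log_p = {}
    for c in classes:
        lp = math.log(prior[c])
        for j, v in enumerate(row):
            denom = sum(counts[c][j].values()) + alpha * len(values[j])
            lp += math.log((counts[c][j].get(v, 0.0) + alpha) / denom)
        log_p[c] = lp
    top = max(log_p.values())  # subtract the max for numerical stability
    z = sum(math.exp(lp - top) for lp in log_p.values())
    return {c: math.exp(log_p[c] - top) / z for c in classes}


def em_naive_bayes(labeled, labels, unlabeled, n_iter=10):
    """Fit NB on labeled rows, then refine it with EM over unlabeled rows."""
    classes = sorted(set(labels))
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for y in labels]
    model = m_step(labeled, hard, classes)  # initial supervised fit
    for _ in range(n_iter):
        soft = [posterior(x, model, classes) for x in unlabeled]   # E-step
        model = m_step(labeled + unlabeled, hard + soft, classes)  # M-step
    return model, classes


# Hypothetical toy records: (protocol, service, connection flag).
labeled = [("tcp", "http", "SF"), ("tcp", "private", "S0")]
labels = ["normal", "attack"]
unlabeled = [("tcp", "http", "SF")] * 3 + [("tcp", "private", "S0")] * 3

model, classes = em_naive_bayes(labeled, labels, unlabeled)
print(posterior(("tcp", "http", "SF"), model, classes))
```

In this sketch the labeled records keep their hard labels in every M-step, while the unlabeled records contribute fractional counts proportional to their current posteriors. A real intrusion detection setting would also need handling for continuous features and a convergence criterion, both of which are omitted here.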

URL: http://sedici.unlp.edu.ar/handle/10915/152809
Related identifiers: http://39jaiio.sadio.org.ar/sites/default/files/39jaiio-asai-16.pdf (alternate URL); ISSN 1850-2784
Format: application/pdf; pages 175-186
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
