Filtering Useless Data at the Source

Authors
Pessolani, Pablo Andrés; Quaglia, Constanza; Nou, Ramón
Publication year
2019
Language
English
Resource type
conference paper
Status
published version
Description
In some processing environments, an application reads remote sequential files containing a large number of records but uses only some of them. Examples include log analysis tools for servers, proxies, firewalls, and intrusion detection systems; sensor log analysis; and the processing of large scientific datasets. To be processed, every record in the file must be transferred over the network and handled by the application. Some of the transferred records are discarded immediately because the application has no interest in them, yet they have already consumed network bandwidth and operating system cache buffers. This article proposes filtering records at the data source without changing the application. Records of interest are transferred unmodified, while only references to the remaining records are transferred from the source to the consuming application. On the application side, the sequence of records is rebuilt, keeping the content of the records of interest and filling the others with dummy values that the application will discard. Because the number and length of records are preserved (and therefore the file size as well), the application does not need to be modified. Once a filtering rule is applied to a file, only the useful records and references to the useless ones are transferred to the application side, reducing network usage, transfer time, and cache utilization. A modified (but compatible) version of the NFS protocol was developed as a proof of concept.
Red de Universidades con Carreras en Informática
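The filter-and-rebuild idea in the description can be illustrated with a minimal sketch. It is not the authors' NFS implementation: it assumes fixed-length records, and the names `RECORD_LEN`, `FILLER`, `filter_at_source`, `rebuild_at_client`, and `wire_size`, as well as the 4-byte reference size, are hypothetical choices made only for this example. The source side replaces each run of unwanted records with a single count (the "reference"), and the consumer side expands each count back into dummy records so the record count and file size are unchanged.

```python
from typing import Callable, List, Union

RECORD_LEN = 16   # assumed fixed record length in bytes (hypothetical)
FILLER = b"#"     # assumed dummy byte the application would discard (hypothetical)

# A wire item is either a full record (bytes) or a count of consecutive
# filtered-out records (int), standing in for the "references".
WireItem = Union[bytes, int]


def filter_at_source(data: bytes, keep: Callable[[bytes], bool]) -> List[WireItem]:
    """Source side: keep records of interest, collapse the rest into run counts."""
    if len(data) % RECORD_LEN != 0:
        raise ValueError("data length must be a multiple of RECORD_LEN")
    items: List[WireItem] = []
    for offset in range(0, len(data), RECORD_LEN):
        record = data[offset:offset + RECORD_LEN]
        if keep(record):
            items.append(record)
        elif items and isinstance(items[-1], int):
            items[-1] += 1          # extend the current run of skipped records
        else:
            items.append(1)         # start a new run of skipped records
    return items


def rebuild_at_client(items: List[WireItem]) -> bytes:
    """Consumer side: restore the original record sequence and file size."""
    out = bytearray()
    for item in items:
        if isinstance(item, int):
            out += FILLER * (RECORD_LEN * item)   # dummy records
        else:
            out += item                           # record of interest, unmodified
    return bytes(out)


def wire_size(items: List[WireItem], ref_size: int = 4) -> int:
    """Bytes sent if each run count is encoded in ref_size bytes (assumption)."""
    return sum(ref_size if isinstance(i, int) else len(i) for i in items)
```

With a rule such as `lambda r: r.startswith(b"ERROR")`, the rebuilt buffer has the same length as the original, so an unmodified reader sees the expected file size and skips the dummy records as it would any other uninteresting record.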
Subject
Computer Science
Logging
Network File System (NFS) protocol
NFS
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/90693

id SEDICI_43b55968887322e4d57c87deb554d7b6
oai_identifier_str oai:sedici.unlp.edu.ar:10915/90693
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
publishDate 2019
dc.date.none.fl_str_mv 2019-10
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
Conference object
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
format conferenceObject
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/90693
url http://sedici.unlp.edu.ar/handle/10915/90693
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/isbn/978-987-688-377-1
info:eu-repo/semantics/reference/hdl/10915/90359
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-sa/4.0/
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.format.none.fl_str_mv application/pdf
879-888
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar