Fortran refactoring for legacy systems

Autores
Méndez, Mariano
Año de publicación
2011
Idioma
inglés
Tipo de recurso
tesis de maestría
Estado
versión aceptada
Colaborador/a o director/a de tesis
Tinetti, Fernando Gustavo
Garrido, Alejandra
Descripción
The motivation of this work comes from a Global Climate Model (GCM) Software which was in great need of being updated. This software was implemented by scientists in the ’80s as a result of meteorological research. Written in Fortran 77, this program has been used as an input to make climate predictions for the Southern Hemisphere. The execution to get a complete numerical data set takes several days. This software has been programmed using a sequential processing paradigm. In these days, where multicore processors are so widespread, the time that an execution takes to get a complete useful data set can be drastically reduced using this technology. As a first objective to reach this goal of reengineering we must be able to understand the source code. An essential Fortran code characteristic is that old source code versions became unreadable, not comprehensive and sometimes “ejects” the reader from the source code. In that way, we can not modify, update or improve unreadable source code. Then, as a first step to parallelize this code we must update it, turn it readable and easy to understand. The GCM has a very complex internal structure. The program is divided into about 300 .f (Fortran 77) files. These files generally implement only one Fortran subroutine. Less than 10% of the files are used for common blocks and constants. Approximately 25% of the lines in the source code are comments. The total number of Fortran source code lines is 58000. A detailed work within the source code brings to light that [74]: 1 About 230 routines are called/used at run time. Most of the runtime is spent in routines located at deep levels 5 to 7 in the dynamic call graph from the main routine. 2 The routine with most of the runtime (the top routine from now on) requires more than 9% of the total program runtime and is called about 315000 times. 3 The top 10 routines (the 10 routines at the top of the flat profile) require about 50% of total runtime. Two of them are related to intrinsic Fortran functions. Our first approach was using a scripting language and Find & Replace tools trying to upgrade the source code, this kind of code manipulation do not guarantee preservation of software behavior. Then, our goal was to develop an automated tool to transform legacy software in more understandable, comprehensible and readable applying refactoring as main technique. At the same time a catalog of transformation to be applied in Fortran code is needed as a guide to programmers through this process.
Es revisado por: http://sedici.unlp.edu.ar/handle/10915/9703
Magister en Ingeniería de Software
Universidad Nacional de La Plata
Facultad de Informática
Materia
Informática
Fortran; legacy systems; refactoring; Photran; software engineering
Nivel de accesibilidad
acceso abierto
Condiciones de uso
Licencia de distribución no exclusiva SEDICI
Repositorio
SEDICI (UNLP)
Institución
Universidad Nacional de La Plata
OAI Identificador
oai:sedici.unlp.edu.ar:10915/4201

id SEDICI_af0ce00f560ce1a33f7d28e415114c40
oai_identifier_str oai:sedici.unlp.edu.ar:10915/4201
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
spelling Fortran refactoring for legacy systemsMéndez, MarianoInformáticaFortran; legacy systems; refactoring; Photran; software engineeringThe motivation of this work comes from a Global Climate Model (GCM) Software which was in great need of being updated. This software was implemented by scientists in the ’80s as a result of meteorological research. Written in Fortran 77, this program has been used as an input to make climate predictions for the Southern Hemisphere. The execution to get a complete numerical data set takes several days. This software has been programmed using a sequential processing paradigm. In these days, where multicore processors are so widespread, the time that an execution takes to get a complete useful data set can be drastically reduced using this technology. As a first objective to reach this goal of reengineering we must be able to understand the source code. An essential Fortran code characteristic is that old source code versions became unreadable, not comprehensive and sometimes “ejects” the reader from the source code. In that way, we can not modify, update or improve unreadable source code. Then, as a first step to parallelize this code we must update it, turn it readable and easy to understand. The GCM has a very complex internal structure. The program is divided into about 300 .f (Fortran 77) files. These files generally implement only one Fortran subroutine. Less than 10% of the files are used for common blocks and constants. Approximately 25% of the lines in the source code are comments. The total number of Fortran source code lines is 58000. A detailed work within the source code brings to light that [74]: 1 About 230 routines are called/used at run time. Most of the runtime is spent in routines located at deep levels 5 to 7 in the dynamic call graph from the main routine. 2 The routine with most of the runtime (the top routine from now on) requires more than 9% of the total program runtime and is called about 315000 times. 3 The top 10 routines (the 10 routines at the top of the flat profile) require about 50% of total runtime. Two of them are related to intrinsic Fortran functions. Our first approach was using a scripting language and Find & Replace tools trying to upgrade the source code, this kind of code manipulation do not guarantee preservation of software behavior. Then, our goal was to develop an automated tool to transform legacy software in more understandable, comprehensible and readable applying refactoring as main technique. At the same time a catalog of transformation to be applied in Fortran code is needed as a guide to programmers through this process.Es revisado por: http://sedici.unlp.edu.ar/handle/10915/9703Magister en Ingeniería de SoftwareUniversidad Nacional de La PlataFacultad de InformáticaTinetti, Fernando GustavoGarrido, Alejandra2011-08-31info:eu-repo/semantics/masterThesisinfo:eu-repo/semantics/acceptedVersionTesis de maestriahttp://purl.org/coar/resource_type/c_bdccinfo:ar-repo/semantics/tesisDeMaestriaapplication/pdfhttp://sedici.unlp.edu.ar/handle/10915/4201https://doi.org/10.35537/10915/4201enginfo:eu-repo/semantics/openAccessLicencia de distribución no exclusiva SEDICIreponame:SEDICI (UNLP)instname:Universidad Nacional de La Platainstacron:UNLP2025-09-03T10:22:21Zoai:sedici.unlp.edu.ar:10915/4201Institucionalhttp://sedici.unlp.edu.ar/Universidad públicaNo correspondehttp://sedici.unlp.edu.ar/oai/snrdalira@sedici.unlp.edu.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:13292025-09-03 10:22:22.025SEDICI (UNLP) - Universidad Nacional de La Platafalse
dc.title.none.fl_str_mv Fortran refactoring for legacy systems
title Fortran refactoring for legacy systems
spellingShingle Fortran refactoring for legacy systems
Méndez, Mariano
Informática
Fortran; legacy systems; refactoring; Photran; software engineering
title_short Fortran refactoring for legacy systems
title_full Fortran refactoring for legacy systems
title_fullStr Fortran refactoring for legacy systems
title_full_unstemmed Fortran refactoring for legacy systems
title_sort Fortran refactoring for legacy systems
dc.creator.none.fl_str_mv Méndez, Mariano
author Méndez, Mariano
author_facet Méndez, Mariano
author_role author
dc.contributor.none.fl_str_mv Tinetti, Fernando Gustavo
Garrido, Alejandra
dc.subject.none.fl_str_mv Informática
Fortran; legacy systems; refactoring; Photran; software engineering
topic Informática
Fortran; legacy systems; refactoring; Photran; software engineering
dc.description.none.fl_txt_mv The motivation of this work comes from a Global Climate Model (GCM) Software which was in great need of being updated. This software was implemented by scientists in the ’80s as a result of meteorological research. Written in Fortran 77, this program has been used as an input to make climate predictions for the Southern Hemisphere. The execution to get a complete numerical data set takes several days. This software has been programmed using a sequential processing paradigm. In these days, where multicore processors are so widespread, the time that an execution takes to get a complete useful data set can be drastically reduced using this technology. As a first objective to reach this goal of reengineering we must be able to understand the source code. An essential Fortran code characteristic is that old source code versions became unreadable, not comprehensive and sometimes “ejects” the reader from the source code. In that way, we can not modify, update or improve unreadable source code. Then, as a first step to parallelize this code we must update it, turn it readable and easy to understand. The GCM has a very complex internal structure. The program is divided into about 300 .f (Fortran 77) files. These files generally implement only one Fortran subroutine. Less than 10% of the files are used for common blocks and constants. Approximately 25% of the lines in the source code are comments. The total number of Fortran source code lines is 58000. A detailed work within the source code brings to light that [74]: 1 About 230 routines are called/used at run time. Most of the runtime is spent in routines located at deep levels 5 to 7 in the dynamic call graph from the main routine. 2 The routine with most of the runtime (the top routine from now on) requires more than 9% of the total program runtime and is called about 315000 times. 3 The top 10 routines (the 10 routines at the top of the flat profile) require about 50% of total runtime. Two of them are related to intrinsic Fortran functions. Our first approach was using a scripting language and Find & Replace tools trying to upgrade the source code, this kind of code manipulation do not guarantee preservation of software behavior. Then, our goal was to develop an automated tool to transform legacy software in more understandable, comprehensible and readable applying refactoring as main technique. At the same time a catalog of transformation to be applied in Fortran code is needed as a guide to programmers through this process.
Es revisado por: http://sedici.unlp.edu.ar/handle/10915/9703
Magister en Ingeniería de Software
Universidad Nacional de La Plata
Facultad de Informática
description The motivation of this work comes from a Global Climate Model (GCM) Software which was in great need of being updated. This software was implemented by scientists in the ’80s as a result of meteorological research. Written in Fortran 77, this program has been used as an input to make climate predictions for the Southern Hemisphere. The execution to get a complete numerical data set takes several days. This software has been programmed using a sequential processing paradigm. In these days, where multicore processors are so widespread, the time that an execution takes to get a complete useful data set can be drastically reduced using this technology. As a first objective to reach this goal of reengineering we must be able to understand the source code. An essential Fortran code characteristic is that old source code versions became unreadable, not comprehensive and sometimes “ejects” the reader from the source code. In that way, we can not modify, update or improve unreadable source code. Then, as a first step to parallelize this code we must update it, turn it readable and easy to understand. The GCM has a very complex internal structure. The program is divided into about 300 .f (Fortran 77) files. These files generally implement only one Fortran subroutine. Less than 10% of the files are used for common blocks and constants. Approximately 25% of the lines in the source code are comments. The total number of Fortran source code lines is 58000. A detailed work within the source code brings to light that [74]: 1 About 230 routines are called/used at run time. Most of the runtime is spent in routines located at deep levels 5 to 7 in the dynamic call graph from the main routine. 2 The routine with most of the runtime (the top routine from now on) requires more than 9% of the total program runtime and is called about 315000 times. 3 The top 10 routines (the 10 routines at the top of the flat profile) require about 50% of total runtime. Two of them are related to intrinsic Fortran functions. Our first approach was using a scripting language and Find & Replace tools trying to upgrade the source code, this kind of code manipulation do not guarantee preservation of software behavior. Then, our goal was to develop an automated tool to transform legacy software in more understandable, comprehensible and readable applying refactoring as main technique. At the same time a catalog of transformation to be applied in Fortran code is needed as a guide to programmers through this process.
publishDate 2011
dc.date.none.fl_str_mv 2011-08-31
dc.type.none.fl_str_mv info:eu-repo/semantics/masterThesis
info:eu-repo/semantics/acceptedVersion
Tesis de maestria
http://purl.org/coar/resource_type/c_bdcc
info:ar-repo/semantics/tesisDeMaestria
format masterThesis
status_str acceptedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/4201
https://doi.org/10.35537/10915/4201
url http://sedici.unlp.edu.ar/handle/10915/4201
https://doi.org/10.35537/10915/4201
dc.language.none.fl_str_mv eng
language eng
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
Licencia de distribución no exclusiva SEDICI
eu_rights_str_mv openAccess
rights_invalid_str_mv Licencia de distribución no exclusiva SEDICI
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
reponame_str SEDICI (UNLP)
collection SEDICI (UNLP)
instname_str Universidad Nacional de La Plata
instacron_str UNLP
institution UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar
_version_ 1842260050603671552
score 13.13397