Using combination of actions in reinforcement learning
- Authors
- Karanik, Marcelo J.; Gramajo, Sergio D.
- Year of publication
- 2010
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Software agents are programs that can observe their environment and act in an attempt to reach their design goals. In most cases, the selection of a particular agent architecture determines the behaviour in response to the different problem states. However, there are some problem domains in which it is desirable for the agent to learn a good action-execution policy by interacting with its environment. This kind of learning is called Reinforcement Learning (RL), and it is useful in the process control area. Given a problem state, the agent selects an adequate action to perform and receives an immediate reward; the estimates for every action are then updated and, over time, the agent learns which action is best to execute. Most reinforcement learning algorithms execute a single action at a time, even when two or more could be applied together. This work applies RL algorithms to find an optimal policy in a gridworld problem and proposes a mechanism for combining actions of different types.
- Facultad de Informática
- Subject
- Computer Science; Learning; SARSA; optimal policy; action combination
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc/3.0/
- Repository
- SEDICI (UNLP)
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/9663
- Identifier
- http://sedici.unlp.edu.ar/handle/10915/9663
- Alternative identifiers
- http://journal.info.unlp.edu.ar/wp-content/uploads/JCST-Apr10-4.pdf; ISSN 1666-6038
- Date
- 2010-04
- Format
- application/pdf; pp. 19-23
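
The description above outlines the learning loop behind SARSA on a gridworld: observe a state, pick an action, receive an immediate reward, and update the action-value estimates until a good policy emerges. The sketch below is illustrative only and does not reproduce the combination mechanism proposed in the paper: it runs tabular SARSA over a hypothetical combined action space in which each action pairs a movement direction with a step size, so two action types are applied at once. The grid size, reward values, and parameters are assumptions made for this sketch.

```python
# Illustrative only: tabular SARSA on a small gridworld with a combined
# action space. The grid size, reward values, and the (direction, step size)
# pairing are assumptions for this sketch, not taken from the paper.
import random

ROWS, COLS = 4, 4
GOAL = (3, 3)
DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STEPS = [1, 2]                                          # hypothetical second action type
ACTIONS = [(d, s) for d in DIRECTIONS for s in STEPS]   # combined actions
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Action-value table Q(s, a) for every (cell, combined action) pair.
Q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS) for a in ACTIONS}

def step(state, action):
    """Apply a combined (direction, step size) action; -1 per move, +10 at the goal."""
    (dr, dc), size = DIRECTIONS[action[0]], action[1]
    r = min(max(state[0] + dr * size, 0), ROWS - 1)
    c = min(max(state[1] + dc * size, 0), COLS - 1)
    nxt = (r, c)
    return nxt, (10.0 if nxt == GOAL else -1.0), nxt == GOAL

def choose(state):
    """Epsilon-greedy selection over the combined action set."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for _ in range(500):                                    # training episodes
    state = (0, 0)
    action = choose(state)
    done = False
    while not done:
        nxt, reward, done = step(state, action)
        nxt_action = choose(nxt)
        # SARSA update: on-policy, bootstraps from the action actually chosen next.
        target = reward + (0.0 if done else GAMMA * Q[(nxt, nxt_action)])
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state, action = nxt, nxt_action

# Greedy policy after learning: the preferred combined action for each cell.
policy = {(r, c): max(ACTIONS, key=lambda a: Q[((r, c), a)])
          for r in range(ROWS) for c in range(COLS)}
print(policy[(0, 0)])
```

The same loop works for any finite combined action set; only ACTIONS and step() would need to change to model a different pairing of action types.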