Dynamic programming for variable discounted Markov decision problems
- Authors
- Della Vecchia, Eugenio; Di Marco, Silvia; Vidal, Fernando
- Year of publication
- 2014
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- We study the existence of optimal strategies and the value function of non-stationary Markov decision processes under variable discounted criteria, when the state space is assumed to be Borel and the action space to be compact. With this new way of defining the value of a policy, we show the existence of Markov deterministic optimal policies in the finite-horizon case and give a recursive method to obtain them. For the infinite-horizon problem we characterize the value function and show the existence of stationary deterministic policies. The approach is based on suitable dynamic programming operators.
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
- Subject
- Computer Science; Markov decision processes; variable discount factor; Programming Environments; Decision problems; dynamic programming
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by/3.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/41704
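The recursive method mentioned in the description can be illustrated with a minimal sketch. It is not the authors' construction: it assumes finite state and action sets (the paper works with Borel spaces), rewards that are maximized, a zero terminal value, and a discount factor `alpha[x, a]` that depends on the current state and action. All names (`backward_induction`, `P`, `r`, `alpha`) are hypothetical.

```python
import numpy as np


def backward_induction(P, r, alpha, horizon):
    """Finite-horizon dynamic programming with a variable discount factor.

    Assumed (hypothetical) inputs:
      P[a, x, y]  probability of moving from state x to y under action a
      r[x, a]     one-step reward
      alpha[x, a] discount factor applied to the continuation value
    Returns the value at time 0 and a Markov deterministic policy,
    where policy[t, x] is the action chosen at time t in state x.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)  # assumed terminal value V_N = 0
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in reversed(range(horizon)):
        # (P @ V) has shape (actions, states); transpose to (states, actions)
        expected_next = (P @ V).T
        # Q[x, a] = r[x, a] + alpha[x, a] * E[V(next state) | x, a]
        Q = r + alpha * expected_next
        policy[t] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return V, policy


# Toy data: action 0 stays in place, action 1 switches state.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
r = np.array([[1.0, 0.0],
              [2.0, 0.0]])
alpha = np.array([[0.5, 0.9],
                  [0.5, 0.5]])
V0, policy = backward_induction(P, r, alpha, horizon=2)
```

With a constant `alpha` this reduces to ordinary discounted backward induction; the only change here is that the discount multiplies the continuation value entry by entry.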
Full record metadata
id | SEDICI_7feb224f81c8bf5ed84394fb74a99a15 |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/41704 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.title.none.fl_str_mv | Dynamic programming for variable discounted Markov decision problems |
title | Dynamic programming for variable discounted Markov decision problems |
spellingShingle | Dynamic programming for variable discounted Markov decision problems; Della Vecchia, Eugenio; Computer Science; Markov decision processes; variable discount factor; Programming Environments; Decision problems; dynamic programming |
title_short | Dynamic programming for variable discounted Markov decision problems |
title_full | Dynamic programming for variable discounted Markov decision problems |
title_fullStr | Dynamic programming for variable discounted Markov decision problems |
title_full_unstemmed | Dynamic programming for variable discounted Markov decision problems |
title_sort | Dynamic programming for variable discounted Markov decision problems |
dc.creator.none.fl_str_mv | Della Vecchia, Eugenio; Di Marco, Silvia; Vidal, Fernando |
author | Della Vecchia, Eugenio |
author_facet | Della Vecchia, Eugenio; Di Marco, Silvia; Vidal, Fernando |
author_role | author |
author2 | Di Marco, Silvia; Vidal, Fernando |
author2_role | author; author |
dc.subject.none.fl_str_mv | Computer Science; Markov decision processes; variable discount factor; Programming Environments; Decision problems; dynamic programming |
topic | Computer Science; Markov decision processes; variable discount factor; Programming Environments; Decision problems; dynamic programming |
dc.description.none.fl_txt_mv | We study the existence of optimal strategies and the value function of non-stationary Markov decision processes under variable discounted criteria, when the state space is assumed to be Borel and the action space to be compact. With this new way of defining the value of a policy, we show the existence of Markov deterministic optimal policies in the finite-horizon case and give a recursive method to obtain them. For the infinite-horizon problem we characterize the value function and show the existence of stationary deterministic policies. The approach is based on suitable dynamic programming operators. Sociedad Argentina de Informática e Investigación Operativa (SADIO) |
description | We study the existence of optimal strategies and the value function of non-stationary Markov decision processes under variable discounted criteria, when the state space is assumed to be Borel and the action space to be compact. With this new way of defining the value of a policy, we show the existence of Markov deterministic optimal policies in the finite-horizon case and give a recursive method to obtain them. For the infinite-horizon problem we characterize the value function and show the existence of stationary deterministic policies. The approach is based on suitable dynamic programming operators. |
publishDate | 2014 |
dc.date.none.fl_str_mv | 2014-09 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format | conferenceObject |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/41704 |
url | http://sedici.unlp.edu.ar/handle/10915/41704 |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/http://43jaiio.sadio.org.ar/proceedings/SIO/17.pdf; info:eu-repo/semantics/altIdentifier/issn/1850-2865 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/3.0/; Creative Commons Attribution 3.0 Unported (CC BY 3.0) |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | http://creativecommons.org/licenses/by/3.0/; Creative Commons Attribution 3.0 Unported (CC BY 3.0) |
dc.format.none.fl_str_mv | application/pdf; 50-62 |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
reponame_str | SEDICI (UNLP) |
collection | SEDICI (UNLP) |
instname_str | Universidad Nacional de La Plata |
instacron_str | UNLP |
institution | UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |