Skip to content

A CANDECOMP-PARAFAC tensor decomposition method to solve a Markov Decision Process (MDP) gridworld problem.

Notifications You must be signed in to change notification settings

danielakuinchtner/cp-mdp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 

Repository files navigation

A Tensor-based Markov Decision Process Solver (CP-MDP)

CP-MDP is an implementation based on the MICAI 2020 paper:

Daniela Kuinchtner, Felipe Meneguzzi and Afonso Sales.
A Tensor-Based Markov Decision Process Representation
In Advances in Soft Computing (pp. 313-324).

This implementation relies on modifying the pymdptoolbox toolkit, which uses a tabular method to represent MDP transition models and to compute the solution, to a tensor decomposition value iteration and policy iteration compact algorihtms (see mdp.py).

The compact value iteration (CP-MDP-VI), and the compact policy iteration (CP-MDP-PI), are based on the CANDECOMP/PARAFAC tensor decomposition definition (which gives the name "CP-MDP" for our solver) to build the transition models as tensor components in a compact representation.

Using GridWorld problems with n-dimensional sizes, we analyse runtime, memory, and computational cost of our method against tabular ones. We detail the results in the paper CP-MDP: A CANDECOMP-PARAFAC Decomposition Approach to Solve a Markov Decision Process Multidimensional Problem on arXiv.