Markov Decision Processes Save

Implementation of value iteration algorithm for calculating an optimal MDP policy

Project README

Markov Decision Processes

A sequential decision problem for a fully observable, stochastic environment with a Markovian transition model and additive rewards is called a Markov decision process, or MDP, and consists of a set of states (with an initial state); a set ACTIONS(s) of actions in each state; a transition model P (s | s, a); and a reward function R(s). A solution must specify what the agent should do for any state that the agent might reach. A solution of this kind is called a policy. This project implements value iteration, for calculating an optimal policy. The basic idea is to calculate the utility of each state and then use the state utilities to select an optimal action in each state.

Usage

python mdp.py transition_file reward_file gamma epsilon

transition_file contains tuple (state, action, result-state, probability)

reward_file contains tuple (state, reward)

gamma is discount factor

epsilon is max error in utility

Open Source Agenda is not affiliated with "Markov Decision Processes" Project. README Source: sachinbiradar9/Markov-Decision-Processes

Stars

Open Issues

Last Commit

6 years ago

Repository

sachinbiradar9/Markov-Decision-Processes

License

GPL-3.0

Open Source Agenda Badge

<a href="https://www.opensourceagenda.com/projects/markov-decision-processes"><img src="https://www.opensourceagenda.com/projects/markov-decision-processes/reviews/badge.svg" alt="Open Source Agenda"></a>

Submit Review Review Your Favorite Project

Submit Resource Articles, Courses, Videos

Submit Article Submit a post to our blog

From the blog

Dec 11, 2022

How to Choose Which Programming Language to Learn First?

From the blog

Dec 11, 2022