Due: last class.
You have a n x n minefield with n mines (start with n = 10), the mines are distributed at random. You have to reach the corner (n-1, n-1); there is an extra state called "sink", aside from these n*n states. You have, at each state, 4 moves: 1, 2, 3, 4 (or right, up, left, down, or whatever - the order does not matter). If you move off the n x n board, you to to the sink. Once in the sink, you stay in the sink. If you are at position x,y and play "right", then with probability q (more on this later) you go to position x+1,y (similarly for the other directions), and with probability 1-q, you "blow up" and go to the sink. The probability q, of not blowing up, is computed as follows. Each mine j, at coordinates x_j, y_j, contributes to a probability of blowing up
1 - q_j = 1 / (1 + sigma * (((x+1) - x_j)2 + (y - y_j)2))
where sigma is a "mine width factor" that we will trim for good pyrotechnical effect; I advise you to start with something like sigma = 0.5. The overall probability of not blowing up is then given by
- q = \prod_j q_j
You must compute a strategy that reaches the goal with maximal probability, creating plots of:
- The probability of blowing up at every square
- The optimal move at every square.
The problem can be modeled as a Markov decision process (MDP). You can solve it in two ways:
Value iteration. You can implement the PPre value-iteration operator, and compute the value fixpoint. Once you reach a fixpoint within a desired accuracy (e.g., 10-5), you can easily derive the optimal strategy.
Strategy iteration (also known as policy iteration). To compute the value of a policy from all states, my suggestion is to simulate the MDP under that policy. If you think at it a moment, you will see that this leads to a simple and efficient algorithm.
For this homework, you can use matlab, octave, scilab, or something else; I recommend scilab as it has the best graphics of the two. See the Notes on Scilab for more information on scilab. Another perfectly good solution is to write your code in any decent language (i.e., Python, Ocaml, C, or Java), and have it output files that gnuplot or similar software is then able to use for the plots.
You should submit a tarball containing the code, and the produced graphics.
