From Wikipedia, discretization is the process of transferring continuous values into discrete states. This can be done because of memory/space requirements that come up with continuous states.

In a simple q-learning environment you would have a grid of X spaces. In the Frozen Lake environment you have a 4×4 grid. So, you would set up your learner with your X actions and their expected values across 16 state spaces. You would then proceed to run the q-learning process (get action, retrieve next state and reward, etc) until you have the optimal policy.

But, when you move to a problem that doesn’t have a discrete state space you will need to discretize it. A simple array sample can be shown with numpy’s digitize call. I will walk through it.

Create your “continuous” state space (image this is huge)

state_space = np.array([0,1,2,3,4,5,6,7,8,9])

Create the bins that you can handle (lets assume that you can only store 3 spaces)

bins = np.array([0,2,7])

Now, run the digitize to combine all the numbers in state_space into their new discretized space

dis = np.digitize(state_space,bins)
dis = array([1, 1, 2, 2, 2, 2, 2, 3, 3, 3], dtype=int64)

This print out shows that numbers 0,1 are in new space 0. 2,3,4,5,6 are in space 1 and 7,8,9 are in space 2.

You can go into more depth if you don’t know the numbers. This is what I had to do with the Machine Learning for Trading grad class when it revolved around technical indicators. In that case, we took it a step further by using the digitized values from 4 indicators and then concatenated them. For example, I used an indicator determining if it was + or – from the previous day. If it was + I would give it a 1. If it was negative I would give it a 0. I would then do the same thing with if it was above an simple moving average. That would lead me to have a positive in the first indicator and a positive in the second indicator returning 11 as my state. If it was + and then -, it would have been 10. And so on.

Discretization allows you to put massive state spaces into something you can store.

### Like this:

Like Loading...