What if I told you that it had been mathematically proven that the orange and red properties in Monopoly by Hasbro are the most profitable (i.e., they are the most landed-on properties)?
Or that in the game of RISK, the best strategy for world domination is to attack a neighboring territory whenever your armies outnumber your opponent's troops by at least two to one?
How do we know this?
Markov Chains!
Now, we briefly looked at linear difference equations and migration matrices, which lay the foundation for Markov Chains, in our lesson on Applications of Linear Systems and Linear Models in Business. This lesson, however, will dig deeper and show how Markov Chains, along with stochastic, transition, and migration matrices, help us study chance processes and give us insight into how the outcome of one experiment can affect the result of the next.
In fact, modern probability theory studies chance processes (i.e., the likelihood of occurrence) for which the knowledge of previous outcomes influences predictions for future experiments and is used for such things as:
- Predicting the weather conditions given the weather on a preceding day
- Estimating a student’s grades based on a sequence of exams in a course
- Modeling how a population moves or migrates from one place to another
- Analyzing classic board games such as Chutes and Ladders, Monopoly, and Risk
Understanding Probability Vectors and Stochastic Matrices
To get a better understanding of all of this, let’s first talk about the fact that a probability vector \(\vec{x}\) in \(\mathbb{R}^{n}\) is a non-negative vector whose entries are probabilities that add up to 1 (i.e., 100%). And a stochastic matrix \(P\) is an \(n \times n\) matrix whose columns are probability vectors.
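If you'd like to check these two definitions numerically, here's a minimal sketch (assuming Python with NumPy, which isn't required for the lesson; the function names are my own):

```python
import numpy as np

def is_probability_vector(x, tol=1e-9):
    """Non-negative entries that sum to 1 (i.e., 100%)."""
    x = np.asarray(x, dtype=float)
    return bool(np.all(x >= -tol)) and abs(x.sum() - 1.0) < tol

def is_stochastic_matrix(P, tol=1e-9):
    """Square matrix whose columns are probability vectors."""
    P = np.asarray(P, dtype=float)
    rows, cols = P.shape
    return rows == cols and all(
        is_probability_vector(P[:, j], tol) for j in range(cols)
    )

print(is_probability_vector([0.55, 0.05, 0.40]))  # True
print(is_probability_vector([0.6, 0.6]))          # False: sums to 1.2
```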
How Markov Chains Work
Therefore, a Markov Chain is a sequence of probability vectors \(\overrightarrow{x_{0}}, \overrightarrow{x_{1}}, \ldots\), where \(\overrightarrow{x_{0}}\) is the initial state, together with a stochastic matrix \(P\), such that \(\overrightarrow{x_{1}}=P \overrightarrow{x_{0}},\ \overrightarrow{x_{2}}=P \overrightarrow{x_{1}},\ \overrightarrow{x_{3}}=P \overrightarrow{x_{2}},\ \ldots\).
Thus, a Markov chain is a first-order difference equation representing probabilities and is described by \(\overrightarrow{x_{k+1}}=P \overrightarrow{x_{k}}\) for \(k=0,1,2, \ldots\), where each \(\overrightarrow{x_{k}}\) is called a state vector.
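In code, that recurrence is just repeated matrix-vector multiplication. Here's a small sketch of stepping a chain forward (again assuming Python with NumPy; the helper names are made up for illustration):

```python
import numpy as np

def markov_step(P, x):
    """One step of the chain: x_{k+1} = P x_k."""
    return P @ x

def run_chain(P, x0, steps):
    """Return the state vectors x_0, x_1, ..., x_steps as a list."""
    states = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        states.append(markov_step(P, states[-1]))
    return states
```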
Alright, so let’s practice creating a stochastic matrix \(P\).
An Example: Predicting Ice Cream Sales
Suppose an ice cream shop offers three flavors: chocolate, strawberry, and vanilla, and is regularly frequented by local customers. In the following directed graph, the vertices (nodes) represent the three states (i.e., chocolate, strawberry, and vanilla), and the arrows represent the transitions and their probabilities.
The graph shows that seventy percent of those who ordered a chocolate cone on their last visit will do so again on their next visit, while ten percent who previously chose chocolate will choose strawberry next time, and twenty percent will choose vanilla.
Using the directed graph, we can build the following stochastic matrix, in which each column sums to 1 and gives the probabilities of moving out of a current state (ice cream choice).
\begin{align*}
P=\left[\begin{array}{lll}
0.7 & 0.3 & 0.1 \\
0.1 & 0.4 & 0.1 \\
0.2 & 0.3 & 0.8
\end{array}\right]
\end{align*}
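As a quick sanity check, a couple of lines of Python/NumPy (purely optional) confirm that every column of this matrix sums to 1:

```python
import numpy as np

# Stochastic matrix from the directed graph
# (rows and columns ordered chocolate, strawberry, vanilla)
P = np.array([
    [0.7, 0.3, 0.1],
    [0.1, 0.4, 0.1],
    [0.2, 0.3, 0.8],
])

print(P.sum(axis=0))  # [1. 1. 1.] -- each column sums to 1
```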
Now let us assume that the outcome of today’s ice cream sales shows that \(55 \%\) of customers chose chocolate, \(5 \%\) chose strawberry, and \(40 \%\) selected vanilla.
\begin{align*}
\overrightarrow{x_{0}}=\left[\begin{array}{l}
0.55 \\
0.05 \\
0.40
\end{array}\right]
\end{align*}
So, using this information, let’s see if we can predict what tomorrow’s ice cream sales are likely to be, knowing that \(\overrightarrow{x_{k+1}}=P \overrightarrow{x_{k}}\).
\begin{align*}
\overrightarrow{x_{1}}=\left[\begin{array}{lll}
0.7 & 0.3 & 0.1 \\
0.1 & 0.4 & 0.1 \\
0.2 & 0.3 & 0.8
\end{array}\right]\left[\begin{array}{l}
0.55 \\
0.05 \\
0.40
\end{array}\right]=\left[\begin{array}{c}
0.44 \\
0.115 \\
0.445
\end{array}\right]
\end{align*}
This indicates that \(44 \%\) of customers will select chocolate, \(11.5 \%\) will choose strawberry, and \(44.5 \%\) will choose vanilla.
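If you want to double-check that arithmetic, a short Python/NumPy sketch (just a verification, not part of the hand calculation) reproduces the prediction:

```python
import numpy as np

P = np.array([
    [0.7, 0.3, 0.1],
    [0.1, 0.4, 0.1],
    [0.2, 0.3, 0.8],
])

# Today's sales: 55% chocolate, 5% strawberry, 40% vanilla
x0 = np.array([0.55, 0.05, 0.40])

# Tomorrow's prediction: x_1 = P x_0
x1 = P @ x0
print(x1)  # [0.44  0.115 0.445]
```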
But wouldn’t it be really nice if the ice cream shop could accurately predict the long-run likelihood of its customers’ ice cream choices? That way, it could make enough of each flavor without having any left over.
From State Vectors to Steady State Vectors
Introducing the steady state vector, sometimes called the equilibrium vector!
The most exciting aspect of Markov Chains is that the next state depends only on the current state, not on which states the chain was in before, and this lets us accurately predict long-term behavior.
If \(P\) is a stochastic matrix, then a steady state vector for \(P\) is a probability vector \(\vec{q}\) such that \(P \vec{q}=\vec{q}\).
But rather than computing numerous iterations of our Markov chain, \(\overrightarrow{x_{k+1}}=P \overrightarrow{x_{k}}\) for \(k=0,1,2, \ldots\), we’re going to cut to the chase and find out when there is no longer any change in the system from one choice to the next.
This is done by utilizing the following technique:
\begin{aligned}
P \vec{x} &= \vec{x} \\
P \vec{x} - \vec{x} &= \vec{0} \\
(P-I) \vec{x} &= \vec{0}
\end{aligned}
Thus, we can find the steady state vector \(\vec{q}\) for \(P\) by row reducing the augmented matrix \(\left[\begin{array}{ll}(P-I) & \vec{0}\end{array}\right]\) and then scaling the resulting solution so that its entries sum to 1.
Calculating the Steady State Vector: A Practical Example
Okay, so using our ice cream shop example, let’s calculate the steady state vector if
\begin{align*}
P &= \left[\begin{array}{lll}
0.7 & 0.3 & 0.1 \\
0.1 & 0.4 & 0.1 \\
0.2 & 0.3 & 0.8
\end{array}\right]
\end{align*}
First, we find \(P-I\)
\begin{aligned}
P-I &= \left[\begin{array}{ccc}
0.7 & 0.3 & 0.1 \\
0.1 & 0.4 & 0.1 \\
0.2 & 0.3 & 0.8
\end{array}\right] - \left[\begin{array}{ccc}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{array}\right] \\
&= \left[\begin{array}{ccc}
-0.3 & 0.3 & 0.1 \\
0.1 & -0.6 & 0.1 \\
0.2 & 0.3 & -0.2
\end{array}\right]
\end{aligned}
Now we will augment the matrix with the zero vector and row reduce
\begin{aligned}
\left[\begin{array}{cccc}
-0.3 & 0.3 & 0.1 & 0 \\
0.1 & -0.6 & 0.1 & 0 \\
0.2 & 0.3 & -0.2 & 0
\end{array}\right] &\sim \left[\begin{array}{cccc}
1 & 0 & -0.6 & 0 \\
0 & 1 & -0.267 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\end{aligned}
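If you'd rather not row reduce by hand, the same reduction can be verified with SymPy (a tooling assumption on my part; exact fractions avoid round-off, and \(-4/15 \approx -0.267\)):

```python
from sympy import Matrix, Rational, zeros

# P - I for the ice cream example, written with exact fractions
P_minus_I = Matrix([
    [Rational(-3, 10), Rational(3, 10),  Rational(1, 10)],
    [Rational(1, 10),  Rational(-6, 10), Rational(1, 10)],
    [Rational(2, 10),  Rational(3, 10),  Rational(-2, 10)],
])

# Augment with the zero vector and row reduce
augmented = P_minus_I.row_join(zeros(3, 1))
rref_form, pivots = augmented.rref()
print(rref_form)  # rows: [1, 0, -3/5, 0], [0, 1, -4/15, 0], [0, 0, 0, 0]
print(pivots)     # (0, 1)
```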
Next, we will write our findings in parametric vector form.
\begin{aligned}
\left.\begin{array}{l}
x_{1} = 0.6\,x_{3} \\
x_{2} = 0.267\,x_{3} \\
x_{3} = x_{3}
\end{array}\right\} \quad \vec{x} = x_{3}\left[\begin{array}{c}
0.6 \\
0.267 \\
1
\end{array}\right]
\end{aligned}
Lastly, we will take our solution and make it stochastic. Remember, the entries of a probability vector must sum to 1. Therefore, we will sum the entries of the column and divide each entry by that total.
\begin{align*}
\underbrace{\left[\begin{array}{c}
0.6 \\
0.267 \\
1
\end{array}\right]}_{\text {total }=1.867} \Rightarrow\left[\begin{array}{c}
0.6 / 1.867 \\
0.267 / 1.867 \\
1 / 1.867
\end{array}\right]=\left[\begin{array}{l}
0.32 \\
0.14 \\
0.54
\end{array}\right]
\end{align*}
This yields a steady-state vector that predicts the long-term sales of ice cream to be approximately \(32 \%\) chocolate, \(14 \%\) strawberry, and \(54 \%\) vanilla. Vanilla is clearly the flavor of choice at this local shop.
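One way to verify this result is to compute an eigenvector of \(P\) for the eigenvalue 1 and rescale it so its entries sum to 1; this is an alternative to the row-reduction approach above, sketched here in Python/NumPy:

```python
import numpy as np

P = np.array([
    [0.7, 0.3, 0.1],
    [0.1, 0.4, 0.1],
    [0.2, 0.3, 0.8],
])

# The eigenvector for eigenvalue 1 satisfies P q = q
vals, vecs = np.linalg.eig(P)
q = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
q = q / q.sum()               # rescale so the entries sum to 1
print(q.round(2))             # [0.32 0.14 0.54]
print(np.allclose(P @ q, q))  # True
```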
Next Steps
In this lesson, you will:
- Create stochastic matrices, transition matrices, and migration matrices
- Use the process of Markov chains to predict the future
- Find the steady state vector
Get ready for an exciting experience!