If the keyword length is five, every fifth letter in the ciphertext
has been shifted by the same distance. This means that for each of the offsets
0, 1, 2, 3, 4, the text obtained by reading every fifth letter will be
an ordinary Caesar shift of the corresponding plaintext letters. Given a string
**x** of *n* characters, the index of coincidence of
**x**, denoted
*I*_{c}(**x**), is defined as the probability
that two randomly chosen elements of
**x** are identical. If the
frequencies of the 26 letters in
**x** are denoted
*f*_{0},..., *f*_{25}, this probability can be written

$$I_c(\mathbf{x}) = \frac{\sum_{i=0}^{25} f_i (f_i - 1)}{n(n-1)} \tag{1}$$

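For a completely random string this value will be approximately 1/26 ≈ 0.038. However, if we let *p*_{0},..., *p*_{25} denote the probabilities of the 26 letters in ordinary English text, then for a string **x** of English text we expect

$$I_c(\mathbf{x}) \approx \sum_{i=0}^{25} p_i^2 \approx 0.065. \tag{2}$$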

If the different letters in a text are shifted by different amounts, the text will look like random text. However, if they are all shifted by the same amount, the index of coincidence should be that of the plaintext. If we compute the mean of the index of coincidence over the five texts obtained by reading every fifth letter at each of the five offsets, we thus expect the result to be closer to 0.065 than to 0.038 if five is the correct keyword length. This is also what we observe above.
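The keyword-length test can be sketched in Python as follows. This is a minimal illustration, assuming the ciphertext is written with the letters A-Z (anything else is dropped); the function names `index_of_coincidence`, `columns` and `mean_ic` are chosen here for illustration and do not come from the text.

```python
from collections import Counter


def _letters(text):
    """Keep only the letters A-Z, converted to upper case."""
    return "".join(c for c in text.upper() if "A" <= c <= "Z")


def index_of_coincidence(text):
    """Probability that two distinct positions of the text hold the same letter."""
    letters = _letters(text)
    n = len(letters)
    if n < 2:
        return 0.0  # not defined for fewer than two letters; 0.0 is a convention
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def columns(text, key_length):
    """Split the text into the strings read at offsets 0, ..., key_length - 1."""
    letters = _letters(text)
    return [letters[offset::key_length] for offset in range(key_length)]


def mean_ic(text, key_length):
    """Mean index of coincidence over the columns for a guessed keyword length."""
    cols = columns(text, key_length)
    return sum(index_of_coincidence(col) for col in cols) / key_length
```

For a candidate keyword length, a value of `mean_ic` near 0.065 supports the guess, while a value near 0.038 speaks against it. On short ciphertexts the estimate is noisy, so it is worth comparing several candidate lengths.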

The next problem is to determine the difference in shifts between the
adjacent letters in the text. This is done in a similar way. Let
**x** and
**y** be two strings with *n*_{x} and *n*_{y} letters, respectively. The mutual index of coincidence of
**x** and
**y**, denoted
*MI*_{c}(**x**,**y**), is defined
as the probability that a random element of
**x** is identical
to a random element of
**y**. If
*f*^{x}_{0},..., *f*^{x}_{25} and
*f*^{y}_{0},..., *f*^{y}_{25} are the frequencies of the 26 letters in
**x** and
**y**, respectively, this probability can be
written

$$MI_c(\mathbf{x},\mathbf{y}) = \frac{\sum_{i=0}^{25} f^{x}_i f^{y}_i}{n_x n_y} \tag{3}$$
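The mutual index of coincidence can be computed directly from the letter counts of the two strings. The following Python sketch assumes text over the letters A-Z (other characters are ignored); the function name `mutual_index_of_coincidence` is illustrative only.

```python
from collections import Counter


def mutual_index_of_coincidence(x, y):
    """Probability that a random letter of x equals a random letter of y."""
    xs = [c for c in x.upper() if "A" <= c <= "Z"]
    ys = [c for c in y.upper() if "A" <= c <= "Z"]
    if not xs or not ys:
        return 0.0  # not defined for an empty string; 0.0 is a convention
    fx, fy = Counter(xs), Counter(ys)
    # A Counter returns 0 for letters that do not occur in y.
    return sum(fx[c] * fy[c] for c in fx) / (len(xs) * len(ys))
```

For example, `mutual_index_of_coincidence("AAB", "AB")` counts 2·1 + 1·1 = 3 matching pairs out of 3·2 = 6, giving 0.5.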

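Assume once more that the keyword length is five. Let **y**_{0},..., **y**_{4} be the strings obtained by reading every fifth letter of the ciphertext at offsets 0,..., 4, and let *k*_{0},..., *k*_{4} be the shifts given by the corresponding keyword letters. Then for two of these strings we expect

$$MI_c(\mathbf{y}_i,\mathbf{y}_j) \approx \sum_{h=0}^{25} p_{h-k_i}\, p_{h-k_j} = \sum_{h=0}^{25} p_h\, p_{h+k_i-k_j} \tag{4}$$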

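where the subscripts are reduced modulo 26. If we test all possible relative shifts of two strings of English text, we see that when the relative shift is 0 the mutual index of coincidence is approximately 0.065, and otherwise it lies between 0.030 and 0.045. This means that if we test all possible values for the relative shift between **y**_{i} and **y**_{j}, the value that gives a mutual index of coincidence close to 0.065 should be the difference *k*_{i} - *k*_{j} between the corresponding keyword letters. Once these differences are known, the whole keyword is determined by its first letter.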

Now we only have 26 possible keywords and plaintexts, and a visual inspection easily reveals the solution.
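The last two steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: encryption is assumed to add the keyword letter to the plaintext letter modulo 26, all inputs are assumed to be upper-case strings over A-Z, and the names `best_relative_shift`, `candidate_keys` and `decrypt` are hypothetical.

```python
from collections import Counter
from itertools import cycle

A = ord("A")


def shift_text(text, g):
    """Shift every letter of an upper-case A-Z string forward by g modulo 26."""
    return "".join(chr((ord(c) - A + g) % 26 + A) for c in text)


def mutual_ic(x, y):
    """Mutual index of coincidence of two non-empty A-Z strings."""
    fx, fy = Counter(x), Counter(y)
    return sum(fx[c] * fy[c] for c in fx) / (len(x) * len(y))


def best_relative_shift(y_i, y_j):
    """Return g in 0..25 maximising the mutual index of y_i and y_j shifted by g.

    Under the assumed convention this g estimates k_i - k_j modulo 26.
    """
    return max(range(26), key=lambda g: mutual_ic(y_i, shift_text(y_j, g)))


def candidate_keys(shifts_to_first):
    """List the 26 keywords consistent with the estimated relative shifts.

    shifts_to_first[i] is the estimate of k_0 - k_i, so its first entry is 0.
    """
    return [
        "".join(chr((k0 - g) % 26 + A) for g in shifts_to_first)
        for k0 in range(26)
    ]


def decrypt(ciphertext, key):
    """Subtract the repeating keyword from the ciphertext, letter by letter."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + A) for c, k in zip(ciphertext, cycle(key))
    )
```

With the columns of the ciphertext in a list `cols`, one would compute `best_relative_shift(cols[0], cols[i])` for each column, pass the resulting list to `candidate_keys`, and inspect `decrypt(ciphertext, key)` for each of the 26 candidates by eye.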