Breaking the Vigenère Cipher

The Vigenère cipher is a very known cipher for centuries, you can read more about it from here. It uses a series of Caesar ciphers to encrypt the text. Sometime ago I came across a challenge in breaking the Vigenère cipher. But this was a variant of a Vigenère cipher which uses XOR gate instead of normal polyalphabetic substitution. Here is the encryption program. In a challenge you won’t get the encryption program but however you can code it.

#include <stdio.h>
#define KEY_LENGTH 7 
/*
* Title: Variant of a Vignere Cipher Encrypter
* Author: Osanda Malith Jayathissa
* Website: https://osandamalith.wordpress.com
*/
int main() {

  FILE *fpIn, *fpOut;
  unsigned char ch;
  unsigned char key[KEY_LENGTH] = {0xBA, 0x1F, 0x91, 0xB2, 0x53, 0xCD, 0x3E};

  fpIn = fopen("ptext.txt", "r");
  fpOut = fopen("ctext.txt", "w");

  for (size_t i = 0; fscanf(fpIn, "%c", &ch) != EOF; ++i) 
  if (ch!='\n') fprintf(fpOut, "%02X", ch ^ key[i % KEY_LENGTH]);   

  fclose(fpIn);
  fclose(fpOut);
  
  return 0;
} 


In Python:

# Title: Variant of a Vignere Cipher Encrypter
# Author: Osanda Malith Jayathissa
# Website: https://osandamalith.wordpress.com

def Crypt(key, Str):
        kIdx = 0
        cryptStr = ''
        for x in range(len(Str)):
          if '\n' not in Str[x]:
            cryptStr += chr(ord(Str[x]) ^ ord(key[kIdx]))
            kIdx = (kIdx + 1) % len(key)

        return cryptStr

Hexify = lambda x:''.join([hex(ord(c))[2:].zfill(2) for c in x]).upper()

if __name__ == "__main__":
    key = "\xAB\x3F\x92\xB2\x53\xCD\x2E" # Key of len 7
    ptext = "Your text goes here"
    print("Encrypted Text is :\n")
    print (Hexify(Crypt(key, ptext)))

In the above code we will open the plain text file and start XORing each byte with the key. We won’t XOR new lines since the complexity would be high.
I would be very grateful to Dimitrios Kalemis for teaching me this method of breaking the Vigenere cipher 🙂
I will be using Excel and VBA for statistical analysis. Let’s try to break the following cipher text:

FB50F7C621B40EC24CB2C63BA80EC75EFCD526AC49CE1FFDD473B946CE1FFBDF32AA47C55EE6DB3CA30ECA51F69227A54B8B4FF3C120A441C54CBC921AB90ED95AFED327A85D8B4BFD9224A54FDF5AE4D721ED49C249F7C173A443C65AF6DB32B94B8B4FFED732BE5BD95AB2DD21ED5ECA56FC9227A20EDF57F7923BB843CA51B2DF3AA34A851FDBC673AE41C65AE1923BA243CE1FE6DD73B946CE1FF0DD20A243D81FF3DC37ED4CDE4CFBDC36BE5DCE4CB2DD35ED43CE51A99235A25C8B51FDC63BA440CC1FF0C727ED59C35EE69230A243CE4CB2DA3CA04B8B4BFD9227A54BC61FFBDC73B946CE1FFFDD20B90ECC5AFCD721AC428B5EFCD673A440DF5AFEDE3AAA47C953F79220A54FDB5AB2D132A30EC95AB2D373BE5BC955F7D127ED48C44DB2C23CA85AD946BC9203A24BDF4DEB923ABE0EDF57F79226A347DD5AE0C132A10EC75EFCD526AC49CE1FE5DA3AAE468B4BFAD773A54BCA4DE6923BA242CF4CB2C53AB9468B51F3C626BF4B8B5EFCD673A45AD85AFED47DED66CE1FE5DA3CED46CA4CB2D373AE41C54BF7DF23B90ECD50E09223A24BDF4DEB9230AC40C550E6923BAC58CE1FFFC730A50ED95AE1C236AE5A8B59FDC073A547C64CF7DE35E10EC44DB2D43CBF0ECA51EBC63BA440CC1FF7DE20A8008B68FAD721A858CE4DB2C63BA85CCE1FFBC173AC0ED85AFCC136ED41CD1FF0D732B85AD213B2DD21ED5EC448F7C07FED41D91FFAD321A041C546BE9232BE0EC251B2C63BA80EC650E6DB3CA30EC459B2D373BA4FDD5AB2DD35ED5AC35AB2C136AC028B56FC9227A54B8B58E0DD24B9468B50F49232ED48C750E5D721E10EDF57F7C036ED47D81FE2DD36B95CD21FFBDC73A45AD81FF0DB21B94685

We perform statistical analysis in each column. For place the bytes in columns corresponding to length of the key. Since the key length is assumed to be something in between 1 and 14 we have to write 14 macros to place the bytes in order. After that we sort the bytes in the highest occurring order. If we find columns with all bytes occurring few time then it’s not a valid key length. If we find columns where only few bytes occur many times that means it may correspond to the frequently occurring letters such as a space or the letter “e” in the English alphabet.
We can write a macro for excel to place the cipher text on columns ranging from 1 to 14.

Sub Cipher()
   ' Website: https://osandamalith.wordpress.com
   Dim myString As String
   Dim myCut As String
   Dim i As Long
   Dim row As Long
   Dim col As Long
   
   myString = "FB50F7C621B40EC24CB2C63BA80EC75EFCD526AC49CE1FFDD473B946CE1FFBDF32AA47C55EE6DB3CA30ECA51F69227A54B8B4FF3C120A441C54CBC921AB90ED95AFED327A85D8B4BFD9224A54FDF5AE4D721ED49C249F7C173A443C65AF6DB32B94B8B4FFED732BE5BD95AB2DD21ED5ECA56FC9227A20EDF57F7923BB843CA51B2DF3AA34A851FDBC673AE41C65AE1923BA243CE1FE6DD73B946CE1FF0DD20A243D81FF3DC37ED4CDE4CFBDC36BE5DCE4CB2DD35ED43CE51A99235A25C8B51FDC63BA440CC1FF0C727ED59C35EE69230A243CE4CB2DA3CA04B8B4BFD9227A54BC61FFBDC73B946CE1FFFDD20B90ECC5AFCD721AC428B5EFCD673A440DF5AFEDE3AAA47C953F79220A54FDB5AB2D132A30EC95AB2D373BE5BC955F7D127ED48C44DB2C23CA85AD946BC9203A24BDF4DEB923ABE0EDF57F79226A347DD5AE0C132A10EC75EFCD526AC49CE1FE5DA3AAE468B4BFAD773A54BCA4DE6923BA242CF4CB2C53AB9468B51F3C626BF4B8B5EFCD673A45AD85AFED47DED66CE1FE5DA3CED46CA4CB2D373AE41C54BF7DF23B90ECD50E09223A24BDF4DEB9230AC40C550E6923BAC58CE1FFFC730A50ED95AE1C236AE5A8B59FDC073A547C64CF7DE35E10EC44DB2D43CBF0ECA51EBC63BA440CC1FF7DE20A8008B" & _
   "68FAD721A858CE4DB2C63BA85CCE1FFBC173AC0ED85AFCC136ED41CD1FF0D732B85AD213B2DD21ED5EC448F7C07FED41D91FFAD321A041C546BE9232BE0EC251B2C63BA80EC650E6DB3CA30EC459B2D373BA4FDD5AB2DD35ED5AC35AB2C136AC028B56FC9227A54B8B58E0DD24B9468B50F49232ED48C750E5D721E10EDF57F7C036ED47D81FE2DD36B95CD21FFBDC73A45AD81FF0DB21B94685"

   i = 1
   row = 1
   col = 1
   
   Do
   
      myCut = Mid(myString, i, 2)
      i = i + 2
   
      Cells(row, col) = myCut
   
      col = col + 1
      
      If (col = 8) Then
         col = 1
         row = row + 1
      End If
   
      If (i >= Len(myString)) Then
         Exit Do
      End If
   
   Loop

End Sub


Change the value in here “If (col = 8)” to change the column count. Sort the cipher text in each column and you will see the most occurring few bytes. They should be equal to spaces in the plain text. In my cipher text in we can guess the key length should be equal to 7 since we find the most occurring bytes when the column count is 7. So we XOR the most occurring bytes with 0x20 in each column and find the key. Through trial and error we have to test this. Sometimes in some columns different bytes can be frequently occur but still you can guess it  In my case the most occurring bytes in each column are ‘8b’, ‘1f’, ‘b2′, ’92’, ’73’, ‘ed’, ‘0e’. Here is a small program that would XOR the given bytes with 0x20 which is space and provide us the key bytes.

#include <stdio.h>
#define KEY_LENGTH 7 
/*
* Title: Variant of a Vignere Cipher Key Decrypter
* Author: Osanda Malith Jayathissa
* Website: https://osandamalith.wordpress.com
*/
int main() {
	unsigned char space = '\x20';
  	unsigned char bytes[KEY_LENGTH] = {0x8b, 0x1f, 0xb2, 0x92, 0x73, 0xed, 0x0e};
	size_t i;
  	
  	printf("Key of length %d. Key bytes are:\n\n", sizeof(bytes));
	for (i = 0; i < sizeof(bytes); ++i) 	
	fprintf(stdout, "%02X\n", space ^ bytes[i]);   
  
  	return 0;
} 
# Title: Variant of a Vignere Cipher Key Decrypter
# Author: Osanda Malith Jayathissa
# Website: https://osandamalith.wordpress.com

def HeXOR(a, b):  
    return "".join(["%x" % (int(x,16) ^ int(y,16)) for (x, y) in zip(a, b)])


# The most occuring bytes from the column
Bytes = ['8b', '1f', 'b2', '92', '73', 'ed', '0e']

print("Key of length %d. Key bytes are:\n") %len(Bytes)
for i in Bytes:
	print HeXOR(i,'20') # Assuming space '0x20' as the most occuring letter

Column Cipher character corresponding to space Key value = cipher character XOR Hex 20 (space)
1 8B 0x8B XOR 0x20 = 0xAB
2 1F 0x1F XOR 0x20 = 0x3F
3 B2 0xB2 XOR 0x20 = 0xB2
4 92 0x92 XOR 0x20 = 0xB2
5 73 0x73 XOR 0x20 = 0x53
6 ED 0xED XOR 0x20 = 0xCD
7 0E 0x0E XOR 0x20 = 0x2E

We can decrypt the cipher text using Excel macros as follows or perhaps write a program to do it 🙂

Sub Xoring()
   ' Website: https://osandamalith.wordpress.com
   Dim i As Long
   Dim row As Long
   Dim col As Long
   
   row = 1
   col = 1
   
   Do
      
      If col = 1 Then
         Cells(row, col + 10) = Chr(Val("&H" & (Cells(row, col))) Xor Val("&H8B") Xor Val("&H20"))
      End If
      
      If col = 2 Then
         Cells(row, col + 10) = Chr(Val("&H" & (Cells(row, col))) Xor Val("&H1F") Xor Val("&H20"))
      End If
      
      If col = 3 Then
         Cells(row, col + 10) = Chr(Val("&H" & (Cells(row, col))) Xor Val("&HB2") Xor Val("&H20"))
      End If
      
      If col = 4 Then
         Cells(row, col + 10) = Chr(Val("&H" & (Cells(row, col))) Xor Val("&H92") Xor Val("&H20"))
      End If
      
      If col = 5 Then
         Cells(row, col + 10) = Chr(Val("&H" & (Cells(row, col))) Xor Val("&H73") Xor Val("&H20"))
      End If
      
      If col = 6 Then
         Cells(row, col + 10) = Chr(Val("&H" & (Cells(row, col))) Xor Val("&HED") Xor Val("&H20"))
      End If
      
      If col = 7 Then
         Cells(row, col + 10) = Chr(Val("&H" & (Cells(row, col))) Xor Val("&H0E") Xor Val("&H20"))
      End If
      
      col = col + 1
      
      If Cells(row, col) = "" Then
         If (col = 8) Then
            col = 1
            row = row + 1
         Else
            Exit Do
         End If
      End If
   Loop
   
End Sub

In C


#include <stdio.h>
#define KEY_LENGTH 7
/*
* Title: Variant of a Vignere Cipher Decrypter
* Thank you hasherezade !
* Website: https://osandamalith.wordpress.com
*/ 
int main(int argc, char* argv[])
{
    FILE *fpIn, *fpOut;
    unsigned char key[KEY_LENGTH] = {0xAB, 0x3F, 0x92, 0xB2, 0x53, 0xCD, 0x2E};
    
    if (argc < 3) {
        printf("Invalid args, format <inFile> <outFile>\n");
        return -1;
    }
 
    char *inFileName = argv[1];
    char *outFileName = argv[2];
    fpIn = fopen(inFileName, "r");
    fpOut = fopen(outFileName, "w");
 
    if ( !fpIn || !fpOut) {
        printf("Cannot open the file!\n");
        return -1;
    }
    int enc = 0;
    for (size_t i = 0; !feof(fpIn); ++i) {
        if (!fscanf(fpIn, "%02X", &enc)) {
            printf("cannot scan\n");
            break;
        }
        fprintf(fpOut, "%c", enc ^ key[i % KEY_LENGTH]); 
    }
    fclose(fpIn);
    fclose(fpOut);
    printf("Done! Output saved to: %s\n", outFileName);
    return 0;
}

In python

# Title: Variant of a Vignere Cipher - Decryption
# Author: Osanda Malith Jayathissa
# Website: https://osandamalith.wordpress.com

def decrypt(key, Str):
    a = kIdx = 0
    b = 2
    decryptStr = ''
    for x in range(len(Str)):
        if not b == len(Str):
            decryptStr += chr(int('0x'+Str[a:b],0) ^ ord(key[kIdx]))
            kIdx = (kIdx + 1) % len(key)
            a += 2
            b += 2

    return decryptStr

if __name__ == "__main__":
    key = "\xAB\x3F\x92\xB2\x53\xCD\x2E" # Key of len 7
    ctext = "FB50F7C621B40EC24CB2C63BA80EC75EFCD526AC49CE1FFDD473B946CE1FFBDF32AA47C55EE6DB3CA30ECA51F69227A54B8B4FF3C120A441C54CBC921AB90ED95AFED327A85D8B4BFD9224A54FDF5AE4D721ED49C249F7C173A443C65AF6DB32B94B8B4FFED732BE5BD95AB2DD21ED5ECA56FC9227A20EDF57F7923BB843CA51B2DF3AA34A851FDBC673AE41C65AE1923BA243CE1FE6DD73B946CE1FF0DD20A243D81FF3DC37ED4CDE4CFBDC36BE5DCE4CB2DD35ED43CE51A99235A25C8B51FDC63BA440CC1FF0C727ED59C35EE69230A243CE4CB2DA3CA04B8B4BFD9227A54BC61FFBDC73B946CE1FFFDD20B90ECC5AFCD721AC428B5EFCD673A440DF5AFEDE3AAA47C953F79220A54FDB5AB2D132A30EC95AB2D373BE5BC955F7D127ED48C44DB2C23CA85AD946BC9203A24BDF4DEB923ABE0EDF57F79226A347DD5AE0C132A10EC75EFCD526AC49CE1FE5DA3AAE468B4BFAD773A54BCA4DE6923BA242CF4CB2C53AB9468B51F3C626BF4B8B5EFCD673A45AD85AFED47DED66CE1FE5DA3CED46CA4CB2D373AE41C54BF7DF23B90ECD50E09223A24BDF4DEB9230AC40C550E6923BAC58CE1FFFC730A50ED95AE1C236AE5A8B59FDC073A547C64CF7DE35E10EC44DB2D43CBF0ECA51EBC63BA440CC1FF7DE20A8008B68FAD721A858CE4DB2C63BA85CCE1FFBC173AC0ED85AFCC136ED41CD1FF0D732B85AD213B2DD21ED5EC448F7C07FED41D91FFAD321A041C546BE9232BE0EC251B2C63BA80EC650E6DB3CA30EC459B2D373BA4FDD5AB2DD35ED5AC35AB2C136AC028B56FC9227A54B8B58E0DD24B9468B50F49232ED48C750E5D721E10EDF57F7C036ED47D81FE2DD36B95CD21FFBDC73A45AD81FF0DB21B94685"
    print("Decrypted Text is :\n")
    print (decrypt(key, ctext))

So this is the plain text 🙂

Poetry is the language of the imagination and the passions. It relates to whatever gives immediate pleasure or pain to the human mind. It comes home to the bosoms and businesses of men; for nothing but what comes home to them in the most general and intelligible shape can be a subject for poetry. Poetry is the universal language which the heart holds with nature and itself. He who has a contempt for poetry cannot have much respect for himself, or for anything else. Wherever there is a sense of beauty, or power, or harmony, as in the motion of a wave of the sea, in the growth of a flower, there is poetry in its birth.

This method of breaking is easy, but there are more other ways you could break. In cryptanalysis of the Vigenère cipher there are two main steps. First we try to find the length of the key by calculating the index of coincidence I.C = \frac{\sum_{i = A}^{i = Z}f_i{(f_i-1)} } {N(N-1)}
To identify the specific key we use the Chi-Squared statistic.
X^{2}(C,E) =\sum_{i = A}^{i = Z}\frac{(C_{i}- E_{i}) ^{2} }{E_{i} }
Here is a full implementation using python which uses the above two statistics and break the cipher.

Special thanks to hasherezade for helping me with coding.

Thanks for reading!

Advertisements

4 thoughts on “Breaking the Vigenère Cipher

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s