The XOR cipher is a simple additive encryption technique on its own, but it is commonly used as a building block in other encryption techniques. The truth table for XOR is shown below. If the two bits are the same, the result is 0; if they differ, the result is 1.

 

Bit 1 | Bit 2 | Result
  0   |   0   |   0
  1   |   0   |   1
  0   |   1   |   1
  1   |   1   |   0

 

Let’s take an example. We will encrypt Sun by XORing every character with the same key, 01010010.

Encryption
Text          |     S    |     u     |     n       |
ASCII Code    |    083   |    117    |    110      |
Binary        | 01010011 |  01110101 |  01101110   |
Key           | 01010010 |  01010010 |  01010010   |
Cipher        | 00000001 |  00100111 |  00111100   |

Now if we XOR the cipher with the same key, we get back our original text.

Decryption
Cipher        | 00000001 |  00100111 |  00111100   |  
Key           | 01010010 |  01010010 |  01010010   |
Output        | 01010011 |  01110101 |  01101110   |
ASCII Code    |    083   |    117    |     110     |
Text          |     S    |     u     |      n      |
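
The two tables above can be reproduced with a short sketch. It is written for Python 3, whereas the scripts in this post target Python 2, and the helper name xor_bytes is made up for illustration. It XORs each byte of the input with the key, repeating the key as often as needed.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

key = bytes([0b01010010])          # the single-byte key from the example
cipher = xor_bytes(b"Sun", key)    # encrypt
plain = xor_bytes(cipher, key)     # decrypt with the very same call

print([format(b, "08b") for b in cipher])  # ['00000001', '00100111', '00111100']
print(plain.decode("ascii"))               # Sun
```

Because XOR is its own inverse, the same function serves for both directions.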

The encryption we just did is not very secure because we used the same key over and over again. To make the encryption more secure, the key should not repeat. A good technique for this is the one-time pad, where the key is random, at least as long as the message, and never reused. This makes the encryption far more resistant to a brute-force attack.
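
As a rough illustration of the one-time pad idea, here is a minimal Python 3 sketch. The function names otp_encrypt and otp_decrypt are hypothetical, and the sketch assumes that the secrets module is an acceptable source of random key bytes. It only shows the mechanics: a fresh key as long as the message, used once.

```python
import secrets

def otp_encrypt(message: bytes):
    """Return (pad, cipher); pad is a fresh random key as long as the message."""
    pad = secrets.token_bytes(len(message))
    cipher = bytes(m ^ p for m, p in zip(message, pad))
    return pad, cipher

def otp_decrypt(cipher: bytes, pad: bytes) -> bytes:
    """XOR the cipher with the same pad to recover the message."""
    return bytes(c ^ p for c, p in zip(cipher, pad))

pad, cipher = otp_encrypt(b"Sun")
print(otp_decrypt(cipher, pad))  # b'Sun'
```

The pad has to be shared with the receiver over a secure channel and thrown away after use, which is the hard part in practice.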

XOR encryption and decryption

Encryption and decryption with XOR use the same code. A Python implementation is below:

 

input_str = raw_input("Enter the cipher text or plain text: ")
key = raw_input("Enter the key for encryption or decryption: ")
no_of_itr = len(input_str)
output_str = ""


for i in range(no_of_itr):
    current = input_str[i]
    current_key = key[i%len(key)]
    output_str += chr(ord(current) ^ ord(current_key))

print "Here's the output: ", output_str

And here’s a sample run

Image showing sample run of XOR encryption and decryption

 

The entire source code for this post can be found at https://github.com/abhishuk85/cryptography-plays

Any questions, comments or feedback are most welcome.

ROT13 is a letter substitution cipher and a special case of the Caesar Cipher in which each character in the plain text is shifted exactly 13 places. If you are not familiar with the Caesar Cipher, see the Caesar Cipher post in this series. For example, the cipher for SUN becomes FHA.

The cool thing about this technique is that applying ROT13 to the cipher text gives back the plain text: each letter is shifted by 13 places twice, for a total of 26, which is a full trip around the alphabet. For example, when we do a ROT13 on FHA we get back SUN.
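
This self-inverse property is easy to check. The sketch below is for Python 3 and uses the rot_13 text codec from the standard library instead of the script in this post.

```python
import codecs

encoded = codecs.encode("SUN", "rot_13")    # shift every letter by 13
decoded = codecs.encode(encoded, "rot_13")  # shift by 13 again

print(encoded, decoded)  # FHA SUN
```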

 

A block representation of ROT13 encryption and decryption


 

ROT13 Encoder and Decoder

The encoder and decoder for ROT13 are the same, because the shift for both encoding and decoding is the same and no special logic is needed for decoding. Below is the Python code for it. The code is pretty much the same as for the Caesar Cipher, with the shift value always set to 13.

 

alphabets = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

input_str = raw_input("Enter message that you would like to encrypt/decrypt using ROT13: ")
shift = 13
no_of_itr = len(input_str)
output_str = ""

for i in range(no_of_itr):
    current = input_str[i]
    location = alphabets.find(current)
    if location < 0:
        output_str += input_str[i]
    else:
        new_location = (location + shift)%26
        output_str += alphabets[new_location]

print "Here's the output: ", output_str

Here’s a sample run of ROT13

Image showing sample run of ROT13 encoder decoder


 


Base64 is a binary-to-text encoding technique rather than an encryption technique, but I thought it made sense to cover it in this series because it is widely used, especially for transmitting data over the wire. The reason is that the characters chosen for this encoding are printable and are common to most character encodings.

Here is the Base64 index table:

Index Char Index Char Index Char Index Char
0 A 16 Q 32 g 48 w
1 B 17 R 33 h 49 x
2 C 18 S 34 i 50 y
3 D 19 T 35 j 51 z
4 E 20 U 36 k 52 0
5 F 21 V 37 l 53 1
6 G 22 W 38 m 54 2
7 H 23 X 39 n 55 3
8 I 24 Y 40 o 56 4
9 J 25 Z 41 p 57 5
10 K 26 a 42 q 58 6
11 L 27 b 43 r 59 7
12 M 28 c 44 s 60 8
13 N 29 d 45 t 61 9
14 O 30 e 46 u 62 +
15 P 31 f 47 v 63 /

 

A string is converted into Base64 by taking the 8-bit binary equivalent of each character, joining the bits, and slicing them into 6-bit units, since Base64 has 2^6 = 64 symbols and each 6-bit unit can index any one of them. Each unit is then replaced by the character at that index in the table above. Let’s take the string Sun as an example and see how it is represented in Base64.

 

Text          |     S    |     u     |     n       |
ASCII Code    |    083   |    117    |    110      |
Binary        | 01010011 |  01110101 |  01101110   |
6-bit         | 010100 | 110111 | 010101 | 101110  |
Base64 Index  |   20   |    55  |   21   |   46    |
Base64 encoded|    U   |    3   |   V    |    u    |
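
The steps in the table can also be sketched by hand. The code below is for Python 3, and the name base64_encode_manual is invented for illustration. It mirrors the table row by row: 8-bit binary, 6-bit slices, index lookup. It also pads the result with "=" so that the output length is a multiple of 4.

```python
BASE64_CHARS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def base64_encode_manual(data: bytes) -> str:
    # Join the 8-bit binary form of every byte into one long bit string.
    bits = "".join(format(byte, "08b") for byte in data)
    # Pad with zero bits so the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Map each 6-bit slice to its character in the index table.
    encoded = "".join(
        BASE64_CHARS[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)
    )
    # Pad the output with "=" so its length is a multiple of 4.
    return encoded + "=" * (-len(encoded) % 4)

print(base64_encode_manual(b"Sun"))  # U3Vu
```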

We can verify this by converting the string with Python

>>> "Sun".encode("base64")
'U3Vu\n'

 

The newline character that we see at the end of the output is ignored when decoding. Whether we decode the string with or without the newline, we still get the same string back:

>>> "U3Vu\n".decode("base64")
'Sun'
>>> "U3Vu".decode("base64")
'Sun'
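
The "base64" string codec used above exists only in Python 2. For readers on Python 3, a roughly equivalent sketch with the standard base64 module follows. Note that it works on bytes rather than str.

```python
import base64

encoded = base64.b64encode(b"Sun")         # b'U3Vu', no trailing newline
with_newline = base64.encodebytes(b"Sun")  # b'U3Vu\n', like the Python 2 codec

# By default b64decode skips characters outside the Base64 alphabet,
# so the trailing newline makes no difference.
print(base64.b64decode(encoded))       # b'Sun'
print(base64.b64decode(with_newline))  # b'Sun'
```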

 

The number of characters in the output has to be a multiple of 4. If it is not, the output is padded with either one or two “=” to make it so. For example, when we convert Earth to Base64 we see this in action:

>>> "Earth".encode("base64")
'RWFydGg=\n'
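
The amount of padding follows from the input length: every 3 input bytes become 4 output characters, so a leftover of 2 bytes needs one “=” and a leftover of 1 byte needs two. Here is a small Python 3 sketch of that arithmetic, with a made-up helper name padding_needed:

```python
import base64

def padding_needed(data: bytes) -> int:
    """Number of '=' characters Base64 appends for this input."""
    return (3 - len(data) % 3) % 3

for word in (b"Sun", b"Earth", b"Moon"):
    encoded = base64.b64encode(word).decode("ascii")
    print(word.decode("ascii"), encoded, padding_needed(word))
```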

 

Base64 Encoder

Sometimes, for various reasons, strings are Base64 encoded multiple times, and as you might have noticed by now, each pass increases the length of the output. The Base64 encoder that I wrote, using the codec built into Python, takes the number of times you would like to encode your string. The code is pretty straightforward.

 

input_str = raw_input("Enter the string that you like to be base64 encoded:")
times = int(raw_input("How deep do you want it encoded:"))

output_str = input_str

for i in range(times):
    output_str = output_str.encode("base64")

print "Encoded string: ", output_str

 

And here is a sample run

 

Image showing sample run of Base64 encoder
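
For comparison, here is a Python 3 sketch of the same repeated-encoding idea. The function name encode_n_times is hypothetical. It uses base64.b64encode, which does not append the newline that the Python 2 codec adds, so beyond the first pass its output will differ from that of the script above.

```python
import base64

def encode_n_times(text: str, times: int) -> str:
    """Base64 encode text the given number of times."""
    data = text.encode("utf-8")
    for _ in range(times):
        data = base64.b64encode(data)
    return data.decode("ascii")

# Each extra pass makes the output longer.
for depth in range(4):
    encoded = encode_n_times("Sun", depth)
    print(depth, len(encoded), encoded)
```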


 

Base64 Decoder

This is where it gets a little trickier, since while decoding I assume that I do not know how many times the text was encoded. I created a base string that contains all the valid characters in Base64 encoded strings, and then take the Base64 encoded string as input.

 

base_64_encoding_characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="

input_str = raw_input("Enter the base64 encoded string that you would like to decode: ")

With the string to be decoded in hand, we go into a while loop and stay in it until we have a potential candidate for the original string. The basic logic is to try to decode the string; if decoding fails, we append an “=” to its end, increase the error count, and try again. A successful decode resets the error count, and the loop stops after three consecutive failures, which we take to mean that the string cannot be decoded any further.

 

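# binascii provides the error raised when Base64 decoding fails
import binascii

# Assumed starting values; the loop below relies on them being set
error_count = 0
depth = 0
output_str = ""

while error_count < 3:
    # Strip newlines and cut the string at the first invalid character
    input_str, is_end = ValidateAndSplit(input_str.replace('\n',''))

    if is_end:
        break
    try:
        temp = input_str.decode("base64")
        input_str = temp
        output_str = temp
        depth = depth + 1
        error_count = 0
        print input_str
    except binascii.Error as err:
        # Decoding failed: count the error, pad with "=" and retry
        error_count = error_count + 1
        input_str = input_str + "="

print "Potential decoded string: ", output_str, "\nWith depth: ", depth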

The ValidateAndSplit method removes unnecessary characters from the string so that we don’t go down a bad path, and also tells us when we have potentially reached the end of our search.

 

def ValidateAndSplit(input_str):
    is_end = False
    n = len(input_str)
    if n < 1:
        is_end = True
        return input_str, is_end

    for i in range(n):
        c = input_str[i]
        location = base_64_encoding_characters.find(c)
        if location < 0 and c == " ":
            is_end = True
            break
        elif location < 0:
            data = input_str.split(c, 1)
            input_str = data[0]
            break

    return input_str, is_end

Here’s a sample run of this decoder with the same base64 string that we encoded before 10 times

 

Image showing sample run of Base64 decoder


 

The problem with the current approach is that we might over-decode strings that are only one word long. One fix could be to reach out to an online dictionary and check whether we have found a valid word.
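
The decode-until-failure idea can be sketched more compactly in Python 3. This is not the script from this post: the function name decode_repeatedly, the max_depth limit and the "result must be printable ASCII" stopping rule are all assumptions made for illustration. It also assumes the input was produced without embedded newlines. The printable check is only a heuristic and does not fully solve the over-decoding problem described above.

```python
import base64
import binascii

def decode_repeatedly(text, max_depth=50):
    """Base64 decode text until it stops looking like Base64.

    Returns (last successfully decoded text, number of decodes).
    """
    depth = 0
    current = text
    while depth < max_depth:
        candidate = current.strip()
        # Restore any "=" padding that may have been stripped.
        candidate += "=" * (-len(candidate) % 4)
        try:
            raw = base64.b64decode(candidate, validate=True)
            decoded = raw.decode("ascii")
        except (binascii.Error, UnicodeDecodeError):
            break  # not valid Base64, or not text: stop here
        if not decoded or not decoded.isprintable():
            break
        current = decoded
        depth += 1
    return current, depth

# Encode a sample five times, then recover it without knowing the depth.
sample = b"Hello, world!"
for _ in range(5):
    sample = base64.b64encode(sample)
print(decode_repeatedly(sample.decode("ascii")))  # ('Hello, world!', 5)
```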

 


 

If you have ever been interested in cryptography or have started to learn about it, there is a high probability that you have come across the Caesar Cipher. In case you haven’t, here’s how Wikipedia explains it:

In cryptography, a Caesar cipher, also known as Caesar’s cipher, the shift cipher, Caesar’s code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

In this post and further posts in this series, we will implement these encryption and decryption techniques in Python.

Note – You would need Python installed on your system to run the code in this post

 

Caesar Cipher Encoder

This part of the algorithm is fairly straightforward. The code below asks for the plain text that you would like to encrypt using the Caesar Cipher and the number of letters that you would like it shifted by.

Then it calculates the number of characters in the input string, as this is the number of iterations needed to encrypt everything (one character at a time).

During the iteration it uses the alphabets string as its base for shifting the characters, skipping the characters that are not part of this base string to keep the structure of the input intact.

Finally it outputs the ciphertext for you to use.

 

alphabets = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

input_str = raw_input("Enter message that you would like to encrypt using caesar cipher: ")
shift = int(raw_input("Enter a shift value:"))

no_of_itr = len(input_str)
output_str = ""

for i in range(no_of_itr):
    current = input_str[i]
    location = alphabets.find(current)
    if location < 0:
        # Not an uppercase letter: copy it unchanged
        output_str += input_str[i]
    else:
        # Shift within the alphabet, wrapping around after Z
        new_location = (location + shift)%26
        output_str += alphabets[new_location]

print "ciphertext: ", output_str

 

I am assuming that you either already have Python installed on your system or will install it now. To encode your message using the Caesar Cipher, save the above code in a .py file, then in the command prompt navigate to the folder where the file is saved and run a command similar to the one below:

 

D:\>python caesar_encoder.py
Image showing sample run of Caesar Cipher Encoder
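
The encoder's logic can also be wrapped in a function. The sketch below is for Python 3, and the name caesar_shift is invented for illustration. Like the script above, it only shifts uppercase letters. Passing a negative shift undoes the encoding.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_shift(text: str, shift: int) -> str:
    """Shift uppercase letters by `shift` places; leave everything else alone."""
    shifted = []
    for ch in text:
        location = ALPHABET.find(ch)
        if location < 0:
            shifted.append(ch)  # not an uppercase letter, keep as-is
        else:
            shifted.append(ALPHABET[(location + shift) % 26])
    return "".join(shifted)

cipher = caesar_shift("THIS IS MY MESSAGE", 3)
print(cipher)                    # WKLV LV PB PHVVDJH
print(caesar_shift(cipher, -3))  # THIS IS MY MESSAGE
```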


Caesar Cipher Decoder

Hiding something is simple, but finding it is difficult. Similarly, encoding a message is much simpler than decoding it. Now assume that you encoded your message using the Caesar Cipher but then forgot how many characters you actually shifted. In this section we will decipher the ciphertext for all the possible shifts and then use the Oxford dictionary API to check which of these outputs actually makes sense.

The initial section of the code is straightforward. We would ask for the ciphertext and your guess of what the shift could be

alphabets = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

input_str = raw_input("Enter cipher to decode: ")
shift = int(raw_input("Enter your guess of shift: "))

n = len(input_str)
output_str = ""

 

Then we try to decode the ciphertext using your guess and see how good or bad it was.

 

for k in range(n):
    c = input_str[k]
    location = alphabets.find(c)
    if location < 0:
        output_str += input_str[k]
    else:
        new_location = (location - shift)%26
        output_str += alphabets[new_location]
data = output_str.split(" ", 20)
base_output_str = output_str
base_score = getScore(data[:20])

 

Here’s the code for the getScore function. It calculates the score of the decoded string based on its first 20 words by reaching out to the Oxford dictionary and checking whether each one is actually a valid word.

 

# This function gets the meaning score of the potential plain text
def getScore(potential_plain_text):
    current_score = 0.0
    for temp in potential_plain_text:
        #Limiting the match of words with 4 alphabets or more makes it less likely for false positive
        if len(temp) > 3:
            url = base_url + language + query_divider + temp.lower() + prefix_param + limit_param
            r = requests.get(url, headers = {'app_id': app_id, 'app_key': app_key})
            json_data = json.loads(r.text)
            if json_data['metadata']['total'] > 0 and json_data['results'][0]['score'] > 0:
                current_score += json_data['results'][0]['score']*json_data['metadata']['total']
    return current_score

 

Now we loop through all the possible options of character shift and then sort the results based on score

for i in range(26):
    output_str = ""

    for j in range(n):
        c = input_str[j]
        location = alphabets.find(c)
        if location < 0:
            output_str += input_str[j]
        else:
            new_location = (location - i)%26
            output_str += alphabets[new_location]
    data = output_str.split(" ", 20)
    score = int(getScore(data[:20]))
    results.append(Result(score,i,output_str))

results.sort(key=operator.attrgetter('score'))
results.reverse()

print "Plain text from your guess: ", base_output_str, " - with shift: ", shift, " - the score is", base_score, "\n"

print "Below are the potential plain text sorted by the probability of being correct in descending order!"

for t in results:
    print "Potential plain text: ", t.plain_text, " - with shift: ", t.shift, " - the score is", t.score

 

Here’s a sample run

C:\>python caesar_decoder.py
Enter cipher to decode: WKLV LV PB PHVVDJH
Enter your guess of shift: 3

 

And here’s the sample output

Image showing sample run of Caesar Cipher Decoder
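
To try the brute-force idea without API credentials, the dictionary lookup can be swapped for a local word list. The Python 3 sketch below does exactly that. KNOWN_WORDS is a tiny, hypothetical stand-in for the Oxford dictionary, and the names shift_back, score and rank_shifts are invented for illustration, so treat it as a toy version of the approach and not as the script from this post.

```python
import string

ALPHABET = string.ascii_uppercase
# Hypothetical stand-in for the dictionary lookup: a tiny local word list.
KNOWN_WORDS = {"THIS", "IS", "MY", "MESSAGE", "THE", "AND", "HELLO"}

def shift_back(text, shift):
    """Undo a Caesar shift on uppercase letters; keep other characters."""
    return "".join(
        ALPHABET[(ALPHABET.find(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in text
    )

def score(plain_text):
    """Count how many of the first 20 words appear in the word list."""
    return sum(1 for word in plain_text.split()[:20] if word in KNOWN_WORDS)

def rank_shifts(cipher_text):
    """Return (score, shift, plain_text) for all 26 shifts, best first."""
    candidates = []
    for shift in range(26):
        plain_text = shift_back(cipher_text, shift)
        candidates.append((score(plain_text), shift, plain_text))
    return sorted(candidates, key=lambda c: c[0], reverse=True)

best_score, best_shift, best_text = rank_shifts("WKLV LV PB PHVVDJH")[0]
print(best_shift, best_text, best_score)  # 3 THIS IS MY MESSAGE 4
```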


 


 

We all understand the concept of boxing and unboxing in C#, but type casting is a bit more complicated, and we have more than one option to accomplish it.

The two options that we have are 1) an explicit cast to a specific type, or 2) the as keyword. Let’s look at each of them in a little detail in code.

 

I am going to call type casting without any keyword standard type casting. Below is a code example of one of the ways of doing it right.

object obj1 = new object();

try
{
    Person person1 = (Person)obj1;
    Console.WriteLine(person1);
}
catch(InvalidCastException castException)
{
    Console.WriteLine(castException.Message);
}

It is apparent from the code that the object we are trying to cast to type Person is not actually a Person, so the cast will fail; we are prepared for that by catching the InvalidCastException. This is exactly the problem with standard type casting.

We can avoid getting an exception during type casting by using the as keyword. Below is the code example of one way of doing that

object obj1 = new object();
Person person2 = obj1 as Person;
if (person2 != null)
{
    Console.WriteLine(person2);
}
else
{
    Console.WriteLine("The person was not the original type of the object");
}

As we can see from the code above, we used the as keyword to type cast, and since obj1 is not of type Person we get back null as the result of the cast.

 

I know. The next question one would ask is why as is better than catching an exception, since it is almost the same amount of code.

 

The answer to that question is performance. Throwing an exception is always expensive because of the additional work that the runtime needs to do, such as collecting the stack trace, and because of the increased memory pressure due to page faults.

 

There is one catch when using the as keyword, though: it only works for reference types and nullable types. This is understandable, since as returns either the cast object or null, and a non-nullable value type has no way to represent null on failure, so as can’t be used there.

 

In summary, we should use as wherever possible, since checking for null is preferable to throwing an exception. However, if we must use an explicit cast, we should catch the specific InvalidCastException for better performance.

 

Although there is not much to the source code, it can be found on GitHub.

 

Any questions, suggestions or feedback are always welcome.