Single-letter Ciphers

This here’s a machine-translated text that might contain some errors!

About the Cryptography Tasks

These tasks are structured a little different than the other tasks on Piggy, kinda wonderin’ what y’all prefer! 😎

There’s gonna be a bunch of information to start with about the topics, then some tasks afterwards!

The “levels” in this task ain’t quite like the levels before, things are more split up into topics here.

What in Tarnation is a “Cipher”?

Ever had a hankerin’ to write a secret message to a pal, so’s no other folks can make heads or tails of it? Then ya need a cipher! A cipher is just a method for turnin’ plain text into “code” by switchin’ characters (often letters) with other characters. The result looks like pure gibberish to them that don’t know how the code works. The whole point is that only them that knows the key (the rule for switchin’ the letters) can make the code understandable again. In other words: ciphers make secret messages possible, whether it’s childhood games with secret lingo or real spies sendin’ encrypted messages. 😄

Did you know?

The word “cipher” actually comes from an Arabic word: the word sifr, which means “zero”. Maybe ‘cause the secret code looked like nothin’ (no meanin’!) when folks couldn’t crack it!

Thar’s a whole heap o’ different kinds o’ ciphers – some use numbers, some use symbols, and modern data encryption uses mighty complicated algorithms. These here complicated algorithms require some mighty complicated math, so let’s take a look at some simpler algorithms first!

Monoalphabetic Ciphers

Let’s take a look at some o’ the simplest (and oldest) code methods there are: monoalphabetic ciphers.

Monoalphabetic might sound like a tough word, but we can break it down: mono means “one,” and alphabetic’s about the alphabet.

So, monoalphabetic ciphers are codes where ya use a single “encryption alphabet” for the whole message. That is to say, each letter in the original text always gets swapped with the same letter throughout the encrypted message.

For example, if ya decided that A should be swapped with X, then all the A’s in the text get turned into X.

Caesar Cipher

The classic example of a monoalphabetic cipher is the Caesar cipher (named after Julius Caesar). This here is basically a rule ‘bout “shiftin’” all the letters a certain number of places down the alphabet. Seems like Caesar himself used a shift of 3 letters in his secret messages. It works like this: A becomes D, B becomes E, C becomes F, and so on through the alphabet. (When ya go past Z, ya start back at A again.) A message like ABC would then become DEF if we use Caesar’s method.

Nobody Expects a Message in Caesar Cipher?

Caesar Cipher Meme

How the Caesar Cipher Works in Practice:

  • Pick a key: Decide on a secret number (like 3) that tells ya how many places to shift each letter.
  • Swap out each letter: For every letter in the original message, find the letter that’s that many places further along in the alphabet (with a key of 3, A becomes D, B becomes E, and so on – remember to loop back to A after Z if needed). You can also include Æ, Ø and Å, but that gets a bit more complicated.
  • Encrypted message: Replace the letters and write the new message with those “shifted” letters. Presto – you got an unreadable, secret text that only them with the key can understand!
  • To decrypt (that is, turn it back into readable text) you just do the opposite shift back. If you know the key (like 3), it’s just as easy to read the message by movin’ the letters 3 back in the alphabet.
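The steps above can be sketched with a shifted “encryption alphabet”. This here’s just a minimal sketch, assumin’ only the uppercase English letters A to Z get encrypted; the names shifted_alphabet and encrypt_with_table are made up for this illustration.

```python
import string

def shifted_alphabet(shift):
    """Return the uppercase alphabet rotated `shift` places (the encryption alphabet)."""
    letters = string.ascii_uppercase
    shift %= 26
    return letters[shift:] + letters[:shift]

def encrypt_with_table(text, shift):
    """Swap each uppercase A-Z letter via a translation table; other characters pass through."""
    table = str.maketrans(string.ascii_uppercase, shifted_alphabet(shift))
    return text.translate(table)

print(string.ascii_uppercase)   # the plain alphabet
print(shifted_alphabet(3))      # the encryption alphabet for key 3
print(encrypt_with_table("ABC XYZ", 3))
```

Printin’ the two alphabets one above the other gives ya the same kind o’ table ya’d use when encryptin’ by hand.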

Security?

These codes ain’t all that secure in the long run. ‘Cause the pattern (the substitution) is fixed, a fella with enough patience or some clever tricks can figure out the secret pretty quick. For instance, the Caesar cipher only has as many possible shifts as the alphabet has letters (26 in English, and one o’ those changes nothin’), so anyone can try ‘em all ‘til the message makes sense, or use letter frequencies to guess their way through. In other words, maybe don’t use the Caesar cipher for super-secret diary entries or state secrets 😉.
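To see how little work tryin’ every shift really is, here’s a rough sketch that lists all 26 candidates for a short ciphertext. The helper names shift_text and all_shifts are assumptions made for this example, and only the English letters a to z and A to Z are shifted.

```python
def shift_text(text, shift):
    """Shift every English letter `shift` places, wrapping around after z/Z."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # spaces, digits and punctuation are left alone
    return "".join(out)

def all_shifts(ciphertext):
    """Return (key, candidate plaintext) pairs for every possible key 0..25."""
    return [(key, shift_text(ciphertext, -key)) for key in range(26)]

for key, candidate in all_shifts("Khoor, zruog!"):
    print(key, candidate)
```

A person readin’ the 26 printed lines can spot the one that makes sense right away; the line number in front of it is the key.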

Single-alphabet ciphers are a mighty fine way to learn the principle behind encryptin’. They’re simple and show how we can use a plain rule (a key) to turn an understandable text into somethin’ mysterious and unreadable, and back again. So next time ya wanna send a friend a secret message, ya can use the Caesar cipher! Maybe ya’ll can make yer own variant of Caesar’s secret alphabet? 🔐✨


Tasks

Task 1: Gatherin’ Data and Gettin’ Ready

Before ya start trainin’ the model, ya gotta gather and prepare the data. This means downloadin’ the dataset, lookin’ over the data to understand its structure and what’s inside, and then cleanin’ and preparin’ the data to make it fit for model trainin’.

Task 2: Trainin’ the Model

Once the data’s ready, ya can start trainin’ the model. This means pickin’ a suitable model architecture, definin’ the loss function and optimization algorithm, and then trainin’ the model on the trainin’ data.

Task 3: Evaluatin’ and Adjustin’

After the model’s been trained, ya gotta evaluate how it performs on a test set. If the performance ain’t satisfactory, ya can adjust the model architecture, hyperparameters, or trainin’ data and repeat the trainin’.

Programmin’ languages?

Like afore, feel free to use any programmin’ language ya want! The examples here’ll be in Python.

Medium Task 1.1 - Caesar Cipher Encryption

Now we’re gonna actually write some code! We’ll start simple by makin’ the encryption. Based on the theory, it oughta be pretty straightforward.

Implement encryption with the Caesar Cipher by usin’ a function that takes in text and a number that’s the “key”, meanin’ how much the alphabet’s gonna be rotated.

Tips on how to proceed.
  1. Make a function called caesar_cipher that takes the text to be encrypted and a “shift”, that is, how many places in the alphabet the text should be shifted.
  2. Go through letter by letter in the text.
  3. We don’t want to “shift” characters other than letters: figure out how to check if a character in the text is a letter.
  4. We need to “rotate” the letter by n places, that is, we need to add the rotation: figure out how to convert the text to numbers so you can add the shift. Hint: the ord() function.
  5. Remember! You get different values here based on whether you have lowercase or uppercase letters. Refer to ASCII Table.
  6. After you have a value, it’s as simple as adding the n value. But what happens if you’re at the end of the alphabet? We just get nonsense after the letter Z. How do you fix this? This requires some thinking.
Fixing the encryption.

Fixing the encryption completely takes a bit of thinking.

  • The first step to think about is using the modulus operator, %.
  • Since the alphabet (in English) consists of 26 letters, we can take the modulus with 26.
  • But this doesn’t quite work, do you see the reason?
  • Try to print out the value of a character with ord(), what do you get?
  • For a you get 97. If you take this modulus 26, you get 19, not 0. Remember that modulus 26 always gives an answer from 0 to 25.
  • This can be fixed by storing the starting value for uppercase and lowercase letters (ord('A') or ord('a')), subtracting this from the letter, adding the shift, and then taking the modulus. It then becomes: (ord(letter) - ord('a') + shift) % 26
  • To get the correct letter back, just add the starting value again.
  7. After all this, you can finally convert the number back to a letter. Here you can use the chr() function.
  8. Now you can finally add the letter to a result and return the encrypted text!
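The wrap-around arithmetic from the tips can be followed step by step for a single letter. This is only a small worked sketch, assuming a lowercase English letter and a shift of 3; the variable names are chosen just for this illustration.

```python
letter = "y"
shift = 3

start = ord("a")                    # 97, the value of lowercase a
position = ord(letter) - start      # 24: y counted from a = 0
wrapped = (position + shift) % 26   # 27 % 26 = 1, which is the position of b
encrypted = chr(wrapped + start)    # back into the range of lowercase letters

# Without the wrap-around we would land past z, on a non-letter character.
print(ord(letter) + shift, repr(chr(ord(letter) + shift)))
print(encrypted)
```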

Solution:
def caesar_cipher(text, shift):
    result = ""
    for char in text:
        if char.isalpha():
            # figure out the startin' point based on uppercase and lowercase
            start = ord('A') if char.isupper() else ord('a')
            # move to 0-25, add the shift, wrap around with % 26, then move back
            result += chr((ord(char) - start + shift) % 26 + start)
        else:
            # anything that ain't a letter is kept as it is
            result += char
    return result

Easy Task 1.2 - Caesar Cipher Decryption

Decryption is just doin’ the opposite calculation to encryption. Ya subtract the offset instead of addin’ it.

Tips on how to proceed.

Use the function you made in Task 1.1 for this here. Just take the same function, but in reverse. You can do this by shiftin’ by 26 - shift instead: shiftin’ forward by 26 - shift lands on the same letter as shiftin’ back by shift.

Solution:
def caesar_decrypt(text, shift):
    return caesar_cipher(text, 26 - shift)
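A quick round trip is a handy way to check that encryption and decryption really undo each other. The sketch below repeats the two functions so it can run on its own; the extra isascii() check and the sample message are assumptions added for this example (they keep letters like Æ, Ø and Å untouched instead of shiftin’ ‘em wrong).

```python
def caesar_cipher(text, shift):
    result = ""
    for char in text:
        # Assumption for this sketch: only English letters are shifted.
        if char.isascii() and char.isalpha():
            start = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - start + shift) % 26 + start)
        else:
            result += char
    return result

def caesar_decrypt(text, shift):
    # Shiftin' forward by 26 - shift is the same as shiftin' back by shift.
    return caesar_cipher(text, 26 - shift)

message = "Howdy, Partner!"
secret = caesar_cipher(message, 3)
print(secret)
print(caesar_decrypt(secret, 3))
```

In Python, the % operator never returns a negative number for a positive divisor, so caesar_cipher(secret, -3) would give the same result as caesar_decrypt(secret, 3). In some other languages ya gotta be more careful with negative shifts.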

Other Monoalphabetic Ciphers (e.g. Atbash)

There’s other monoalphabetic ciphers out there too! One o’ the simpler ones is the one called the “Atbash” Cipher.

How Does Atbash Work?

This here’s mighty simple: instead of a rotation, each letter is mapped to the letter in the same position of the reversed alphabet. Here’s a table showin’ the mappin’:

a b c d e f g h i j k l m n o p q r s t u v w x y z
z y x w v u t s r q p o n m l k j i h g f e d c b a

Medium Task 1.3 - Atbash Cipher Encryption and Decryption

The neat thing ‘bout Atbash is that it’s its own inverse: mirrorin’ the alphabet twice gets ya back where ya started, so the very same function both encrypts and decrypts. That is to say, if ya made the encryption, ya’ve also automatically made the decryption.

How Can This Be Done in Practice?

You can either compute each letter from its distance to the ends of the alphabet (the letter that sits n places after a becomes the letter n places before z), or make a “look-up” table, that is, a table or dictionary that contains all the letters from a to z and what they should become. A look-up table can also be a good solution if y’all want to make another type of encryption.
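The first option, countin’ from the ends of the alphabet, might look something like the sketch below. It assumes only English letters are mirrored, and the function name atbash is picked just for this example.

```python
def atbash(text):
    """Mirror each English letter within its alphabet: a<->z, b<->y, and so on."""
    result = ""
    for char in text:
        if "a" <= char <= "z":
            # n places after a becomes n places before z
            result += chr(ord("z") - (ord(char) - ord("a")))
        elif "A" <= char <= "Z":
            result += chr(ord("Z") - (ord(char) - ord("A")))
        else:
            result += char
    return result

print(atbash("Hello"))
print(atbash(atbash("Hello")))  # runnin' it twice gives the original back
```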

Lookup table implementation.
letters = {
    'a': 'z',
    'b': 'y',
    'c': 'x',
    'd': 'w',
    # ... add the rest of the letters down here
}

Usin’ this here table, ya can go through letter by letter, then fetch the value for each letter from the lookup table, and write it out. What kinda things ya gotta do for big letters and little letters?
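If typin’ out all 26 entries by hand feels like a chore, the table can also be built by pairin’ the alphabet with its reverse. This is one possible sketch, not the only way: the second table for uppercase letters is an assumption ‘bout how ya might wanna handle big letters, and atbash_lookup is a made-up name.

```python
import string

lower = string.ascii_lowercase

# Pair a..z with z..a to get the lowercase table.
lower_table = dict(zip(lower, reversed(lower)))
# One way to handle big letters: a second table with everything uppercased.
upper_table = {key.upper(): value.upper() for key, value in lower_table.items()}
letters = {**lower_table, **upper_table}

def atbash_lookup(text):
    """Replace every character found in the table; keep all others unchanged."""
    return "".join(letters.get(char, char) for char in text)

print(atbash_lookup("Wizard, 42"))
```

Swappin’ in a different dictionary gives ya a different monoalphabetic cipher without touchin’ the function itself.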


Part 2 - Crackin’ Monoalphabetic Ciphers

In this here section, you’re gonna try and build an algorithm to “crack” a Caesar cipher, meanin’ takin’ a coded text and gettin’ back the original text without knowin’ the key.

This can be done kinda by hand, or you can try usin’ some simple “cryptanalysis.” This here’s a concept we’ll be lookin’ at deeper later on, but for now, we’re just gonna look at one of the simplest ways: Frequency Analysis. Ya can read more ‘bout it under Frequency Analysis or at Wikipedia - frequency analysis.

This here method can be used in more than just Caesar ciphers, it can be used in more complicated algorithms too, but Caesar ciphers are so simple that frequency analysis is a cinch.

How does frequency analysis work?

Frequency analysis is, as the name suggests, a way to check the frequency of letters in a text. Why might this be useful? Imagine you have a long text, let’s imagine an English text, taken from Wikipedia - frequency analysis:

In cryptanalysis, frequency analysis is the study of the frequency of letters or groups of letters in a ciphertext. The method is used as an aid to breaking classical ciphers.

Frequency analysis is based on the fact that, in any given stretch of written language, certain letters and combinations of letters occur with varying frequencies. Moreover, there is a characteristic distribution of letters that is roughly the same for almost all samples of that language. For instance, given a section of English language, E, T, A and O are the most common, while Z, Q, X and J are rare. Likewise, TH, ER, ON, and AN are the most common pairs of letters (termed bigrams or digraphs), and SS, EE, TT, and FF are the most common repeats. The nonsense phrase ETAOIN SHRDLU represents the 12 most frequent letters in typical English language text.

In some ciphers, such properties of the natural language plaintext are preserved in the ciphertext, and these patterns have the potential to be exploited in a ciphertext-only attack.

If we take this text and transform it using a Caesar cipher (also removing commas, spaces, and other special characters), we get the following ciphertext:

xcrgneipcpanhxhugtfjtcrnpcpanhxhxhiwthijsnduiwtugtfjtcrnduatiitghdgvgdjehduatiitghxcprxewtgitmiiwtbtiwdsxhjhtsphpcpxsidqgtpzxcvraphhxrparxewtghugtfjtcrnpcpanhxhxhqphtsdciwtupriiwpixcpcnvxktchigtirwdulgxiitcapcvjpvtrtgipxcatiitghpcsrdbqxcpixdchduatiitghdrrjglxiwkpgnxcvugtfjtcrxthbdgtdktgiwtgtxhprwpgpritgxhixrsxhigxqjixdcduatiitghiwpixhgdjvwaniwthpbtudgpabdhipaahpbeathduiwpiapcvjpvtudgxchipcrtvxktcphtrixdcdutcvaxhwapcvjpvttippcsdpgtiwtbdhirdbbdclwxatofmpcsypgtgpgtaxztlxhtiwtgdcpcspcpgtiwtbdhirdbbdcepxghduatiitghitgbtsqxvgpbhdgsxvgpewhpcshhttiipcsuupgtiwtbdhirdbbdcgtetpihiwtcdchtchtewgphttipdxchwgsajgtegthtcihiwtbdhiugtfjtciatiitghxcinexrpatcvaxhwapcvjpvtitmixchdbtrxewtghhjrwegdetgixthduiwtcpijgpaapcvjpvteapxcitmipgtegthtgktsxciwtrxewtgitmipcsiwthtepiitgchwpktiwteditcixpaidqttmeadxitsxcprxewtgitmidcanpiiprz

This text looks impossible to “crack”, but with “Frequency Analysis” it’s not only possible, but easy.

Look at the following figure:
English frequency distribution

This is a figure showing the distribution of letters in English. What we can see is that the letter E is the most frequent letter, followed by T, A, and O.

This can be converted into a table and then used to count and analyze a given ciphertext in order to “crack” it. In the tasks below, you will create a program that can “crack” the Caesar cipher on its own. It’s true that the Caesar cipher is so simple that you can just check all 26 possibilities manually, but here we will find the solution completely automatically.

Easy Task 2.1 - Makin’ a Frequency Table

In a Python file, build a frequency table o’ the letters in the English language. Ya can try and find this one yerself, but if ya don’t feel like it, we understand that!

If ya absolutely wanna find it yerself, ya can do like in Task 2.2, but on a real big piece o’ text.

English Letter Frequency (the answer)
english_letter_frequency = {
    'E': 12.70, 
    'T': 9.06, 
    'A': 8.17, 
    'O': 7.51, 
    'I': 6.97, 
    'N': 6.75, 
    'S': 6.33, 
    'H': 6.09, 
    'R': 5.99, 
    'D': 4.25, 
    'L': 4.03, 
    'C': 2.78, 
    'U': 2.76, 
    'M': 2.41, 
    'W': 2.36, 
    'F': 2.23, 
    'G': 2.02, 
    'Y': 1.97, 
    'P': 1.93, 
    'B': 1.29, 
    'V': 0.98, 
    'K': 0.77, 
    'J': 0.15, 
    'X': 0.15, 
    'Q': 0.10, 
    'Z': 0.07
}

Medium Task 2.2 - Countin’ the Frequency o’ Letters in Text

Now, we’re gonna build an algorithm that finds the frequency o’ letters in a given text.

Tips on how to proceed
  1. Start with a function that takes in a text (can be anything).
  2. In the function, create a “dictionary” (Python Dictionaries), with entries for each letter of the alphabet, set to 0. ({'A': 0, 'B': 0, 'C': 0, ..., 'Z': 0})
  3. Go through the entire text and count each letter (increase by 1 in the corresponding entry in the dictionary). Here you should probably ignore characters that are not letters, also remember uppercase and lowercase.
  4. Keep track of how many letters have been counted in total.
  5. When you are finished counting, divide (/) each value in the table by the total number of letters counted in step 4 and then multiply by 100; this will give you a percentage frequency. (You can of course also let your table be between 0 and 1.)
  6. Now you should have a frequency table for the text.
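The steps above could be sketched roughly like this. It’s a minimal version, assuming only the 26 English letters are counted and that the result should be in percent; the function name letter_frequency is an assumption made for this example.

```python
import string

def letter_frequency(text):
    """Return a dict mapping 'A'..'Z' to each letter's share of the letters in `text`, in percent."""
    counts = {letter: 0 for letter in string.ascii_uppercase}
    total = 0  # how many letters have been counted in all
    for char in text.upper():
        if char in counts:  # skips spaces, digits, punctuation and non-English letters
            counts[char] += 1
            total += 1
    if total == 0:
        return counts  # no letters at all: every frequency stays 0
    return {letter: count / total * 100 for letter, count in counts.items()}

table = letter_frequency("Aab, c!")
print(table["A"], table["B"], table["C"])
```

Dividin’ by the number of counted letters, not by the length of the whole text, keeps spaces and punctuation from draggin’ the percentages down.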

Medium Task 2.3 - Comparin’ the frequency of a text with the real frequency

Now that ya found the frequency of all the letters in the text, ya can make a function that finds the “distance.” Huh? What’s meant by that?!

Ya can reckon the frequency of, say, E in the text is gonna be a number. Ya can find the “distance” between this number and the actual frequency, which is 12.70. Example: The frequency is 9.63, what’s the distance? The distance is gonna be the absolute value (negative numbers become positive) of the difference between these two values: \(\lvert 9.63 - 12.70 \rvert = 3.07\).

Make a function that goes through each of the letters and finds the distance. Then add all the distances together to a “total” distance.

Math function?

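If you’re wonderin’ how the math function for this here thing works, it looks like this, where \(a_n\) is the frequency of letter number \(n\) in the text and \(b_n\) is the actual frequency of that same letter in English:

\(\sum_{n=1}^{26} \lvert a_n - b_n \rvert\)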

Tips on how to proceed
  1. Use a for-loop to go through the whole frequency table.
  2. For each letter in the frequency table, find the absolute value compared to the actual frequency. Use the abs() function in Python for this.
  3. Add all the values together, and you’ll get a final result.
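One way the distance could be written is shown below. It’s a small sketch that assumes both tables are dictionaries with letters as keys and percentages as values; the name frequency_distance and the two tiny example tables are made up for illustration.

```python
def frequency_distance(observed, expected):
    """Add up the absolute differences between two frequency tables."""
    total = 0.0
    for letter, expected_value in expected.items():
        # A letter missing from the observed table counts as frequency 0.
        total += abs(observed.get(letter, 0.0) - expected_value)
    return total

# Tiny two-letter tables, just to follow the arithmetic by hand.
expected = {"E": 12.70, "T": 9.06}
observed = {"E": 9.63, "T": 9.06}
print(round(frequency_distance(observed, expected), 2))
```

The smaller the total, the more the text “looks like” English as far as letter counts go.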

Hard Task 2.4 - “Crackin’” the Caesar Cipher

Now, we’re gonna put all we’ve done so far together! We’re gonna “crack” a Caesar cipher.

Make a program that “cracks” a Caesar cipher! Without any help from the user, ya gotta be able to throw in some encrypted text and get the decrypted text without needin’ a key.

Test data

Here’s some test data y’all can use, what do these say?

Test-data
cqrbvnbbjpnrbjenahbnlancxwnqxynoduuhhxdjanjkuncxmnlxmnrclxvyuncnuhjwmqnanjanbxvnfxamboaxvxdaojexarcnsnmrqnuuxcqnanrcbxenajwjtrwrqjencqnqrpqpaxdwmhxdfnanarpqccqnwnpxcrjcrxwbfnanbqxac
lsaizivxlmwqiwwekimwuymxiwlsvxwsmxqmklxrsxasvoewibtigxihlsaizivmjmxhsiwksshnsf
bmtxymjwjsfdfsxbjwrjxyfsifsizsktqidtzwxjqkqtslqnajymjpnslgfwsfwitmjdtzhtrjrtxyhfwjkzqqdzutsdtzwmtzwynxstbxywzhpybjqajljyymjjytgjikwfshnxhtktwymnxwjqnjkrzhmymfspxynxgnyyjwhtqifsinfrxnhpfymjfwymfajdtzmfivznjylzfwistyfrtzxjxynwwnslbjqqlttisnlmynkdtzitrjjymtwfyntfsirfwhjqqzxymjwnafqxtkrdbfyhmgniymjrrfpjmfxyj
zwkyvivrivrepzuzfkjzekyviffdnzcckyvpgcvrjvjkreulgjrzukyvjritrjkztkvrtyvirwkvircfexjzcvetvfevwivjydreifjvkfyzjwvvkefnkyvedzjkvinypufpfltfejzuvipflijvcwrezuzfkzehlzivukyvkvrtyvinzkyrjevvinvccrtklrccpzufekjrzukyvjkluvekslkzyrkvkfjvvpfljkreuzexlgkyvivrccsppflijvcw
uwwilxchaniuffehiqhfuqmizupcuncihnbylycmhiqusuvyymbiofxvyuvfynizfscnmqchamulyniimguffniayncnmzunfcnnfyvixsizznbyaliohxnbyvyyizwiolmyzfcymuhsqusvywuomyvyymxihnwulyqbunboguhmnbchecmcgjimmcvfysyffiqvfuwesyffiqvfuwesyffiqvfuwesyffiqvfuweiibvfuweuhxsyffiqfynmmbueycnojufcnnfyvullsvlyuezumncmlyuxswigcha

Tips on how to proceed
  1. Start by makin’ a function that takes in text.
  2. Use the decryption function for the Caesar cipher with rotation N on the text, N starts at 0.
  3. Make a frequency table of the result.
  4. Figure out the distance between that frequency table and the actual English frequency table.
  5. Either: a) keep track of the distances in a list, or b) keep track of the smallest distance so far and the rotation that gave it (this here’s yer key).
  6. Increase the rotation by 1 and repeat steps 2 to 5 until all rotations from 0 to 25 have been tried (a rotation of 26 is the same as 0).
  7. Return the text decrypted with the rotation that gave the smallest distance; that rotation’s the right key.
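Put together, the steps above might look somethin’ like the sketch below. It’s one possible solution under a few assumptions, not the only one: all function names are made up for this example, only English letters are handled, option b) from step 5 is used, and the frequency values are the ones from Task 2.1. Real short messages can fool it, since a handful of letters rarely follows the typical English distribution.

```python
import string

# Frequencies in percent, copied from the table in Task 2.1.
ENGLISH_LETTER_FREQUENCY = {
    'E': 12.70, 'T': 9.06, 'A': 8.17, 'O': 7.51, 'I': 6.97, 'N': 6.75,
    'S': 6.33, 'H': 6.09, 'R': 5.99, 'D': 4.25, 'L': 4.03, 'C': 2.78,
    'U': 2.76, 'M': 2.41, 'W': 2.36, 'F': 2.23, 'G': 2.02, 'Y': 1.97,
    'P': 1.93, 'B': 1.29, 'V': 0.98, 'K': 0.77, 'J': 0.15, 'X': 0.15,
    'Q': 0.10, 'Z': 0.07,
}

def shift_text(text, shift):
    """Shift every English letter `shift` places, wrapping around after z/Z."""
    result = ""
    for char in text:
        if char.isascii() and char.isalpha():
            start = ord('A') if char.isupper() else ord('a')
            result += chr((ord(char) - start + shift) % 26 + start)
        else:
            result += char
    return result

def letter_frequency(text):
    """Percentage of each letter A-Z among the letters in `text`."""
    counts = {letter: 0 for letter in string.ascii_uppercase}
    total = 0
    for char in text.upper():
        if char in counts:
            counts[char] += 1
            total += 1
    if total == 0:
        return counts
    return {letter: count / total * 100 for letter, count in counts.items()}

def frequency_distance(observed, expected):
    """Sum of absolute differences between two frequency tables."""
    return sum(abs(observed[letter] - expected[letter]) for letter in expected)

def crack_caesar(ciphertext):
    """Try every rotation and return (key, text) for the most English-lookin' result."""
    best_key, best_text, best_distance = 0, ciphertext, float("inf")
    for key in range(26):
        candidate = shift_text(ciphertext, -key)  # undo a shift of `key`
        distance = frequency_distance(letter_frequency(candidate), ENGLISH_LETTER_FREQUENCY)
        if distance < best_distance:
            best_key, best_text, best_distance = key, candidate, distance
    return best_key, best_text

# Round trip on the start of the Wikipedia quote from earlier in this text.
sample = ("In cryptanalysis, frequency analysis is the study of the frequency of "
          "letters or groups of letters in a ciphertext. The method is used as an "
          "aid to breaking classical ciphers.")
key, text = crack_caesar(shift_text(sample, 15))
print(key)
print(text)
```

Keepin’ only the best candidate (option b) saves memory; keepin’ the whole list of distances (option a) is handy if ya wanna look at the runner-ups when a short message gets cracked wrong.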