Using Clojure for the Cryptopals cryptography challenges
I've been going through the Cryptopals challenges recently. For those who don't know, they're a series of exercises based on real-life cryptography breaks, and they're lots of fun. You can see my code here.
Clojure is one of my favorite languages, so I decided to tackle them with it. Here are some thoughts on the first 3 sets, plus a few issues I ran into along the way. I hope this helps anyone else going through them without giving too much away.
General challenge notes
Challenge 3 - Single-byte XOR cipher
When doing frequency analysis, the instructions make no mention of the chi-squared statistic, though that's probably a good starting point, and generalizes beyond alphabetic frequencies to any expected distribution. You can use it later when detecting ECB vs CBC, since you can use it to detect deviations from uniformity.
Challenge 7 - AES in ECB mode
The instructions do mention this, but after chatting with someone else doing the challenges, realize that you do not need to write AES yourself. Just use a default implementation for your language. You only need the single block transform (ECB), which you'll use to build on top of. For Java, all you need is javax.crypto.Cipher and SecretKeySpec, using "AES/ECB/NoPadding".
Challenge 9 - Implement PKCS#7 padding
When implementing PKCS#7 padding, the instructions make no mention of what to do when your data lines up perfectly on a block boundary. Only later, when doing the padding oracle challenge, did I realize you have to add an entire padding block (sixteen 16's) in that case.
Challenge 17 - The CBC padding oracle
When altering the first byte of a block to see if it produces valid padding, you're aiming to alter it such that the padding ends in a single 1 byte. But remember, if you test the original, unaltered byte, you will also get valid PKCS#7 padding (since it was correct to begin with), which is probably not 1.
Challenge 18 - Implement CTR, the stream cipher mode
Look carefully at the examples. At first glance it looks as if you're XORing a bunch of zeros, but the ninth byte is actually incrementing.
Challenge 23 - Clone an MT19937 RNG from its output
Inverting the Mersenne Twister's temper function has a computationally simple version. It's not necessary to use brute force, inverse matrix multiplication, or a constraint solver. It can be done with just bit operations. The key insight is that XORing an integer with a shifted version of itself results in many of the bits in the output being identical (since the shifted-in bits are 0). You can then shift those bits and XOR again to obtain more bits, repeating until you have the whole thing.
Clojure / Java-specific issues
Poor primitive support in JVM and Clojure
The bread and butter of cryptography is byte-level manipulation. Unfortunately, the JVM treats bytes as second-class citizens (e.g., many methods on Integer/Long have no Byte counterpart), and lacks support for unsigned integer types. On top of this, Clojure defaults to 64-bit longs everywhere, necessitating a lot of conversion. I ended up writing many conversion routines just for basic support, and leaving integers in the default long type for compatibility.
The single biggest thing to watch for was when I got deep into bit-shifting code while building the Mersenne Twister random number generator. When everything is a long, unsigned right bit shifts will not do what you expect. The 32-bit version of the Mersenne Twister (MT) relies on unsigned 32-bit ints. But unsigned shifting adds zeros at the end, which, if you're using a larger data type, will result in the zeros entering above the bits of interest. E.g., consider the case where all bits are set to 1 (in two's complement, this is -1), and you do an unsigned shift right by 16 bits. You want:
00000000000000001111111111111111
but you get:
0000000000000000111111111111111111111111111111111111111111111111
Since you only want the right-most 32 bits, it appears as if shifting had no effect!
Final Thoughts
- Some of these ciphers are so broken, I'm amazed they were ever used. Apparently, the simple substitution cipher (replace each letter with the letter n higher) was in use as late as 1915 by the Russian army, despite frequency analysis having been described a millennia (!) earlier.
- Almost everything decrypts to Vanilla Ice lyrics. Unfortunately, there's probably nothing hidden in them.