Breaking

On the insecurity of math.random and it’s siblings

During code reviews we often see developers using weak RNGs like math.random() to generate cryptographic secrets. We think it is commonly known that weak random number generators (RNG) must not be used for any kind of secret and recommend using secure alternatives. I explicitly did not state a specific language yet, because basically every language offers both weak and strong RNGs.

So I asked myself: What if I use a weak RNG to generate a secret? Is it possible to recover the secret from some derived value, like a hash?

Some theory on RNG

Before going into the practical part, I’d like to give you some theoretical background. The basic problem with random number generation is the fact that computers are deterministic. That means a program always produces the same output for some input and internal state. I know, it doesn’t always feel like that 😉

So we have to find a way to generate random looking numbers (or bitstreams on the lower level). One of the oldest but insecure RNGs are linear congruential generators. For the human eye it seems they produce random numbers, but it is easy to recalculate the internal state with some output and math. And by knowing the internal state, an attacker is able to know any future output.

And this leads us to the basic goal of an attacker: be able to guess or predict the output of a RNG. And unfortunately most “default” RNGs are based on linear congruential generators, like Java’s java.util.Random.

Internals of java.util.Random

As said, java.util.Random is based on the linear congruential generator construct and thus is bad enough by itself. Still one could argue that by properly seeding the RNG and only producing a few bytes it is not possible to recover the internal state. For RNGs a so called seed is a value to initialize the internal state before even generating the first number.

So let’s look what Java’s implementation is doing:

Refering to OpenJDK’s Implementation (jdk8u60) of java.util.Random one finds two constructors: one with a seed parameter, the other without. The interesting question is: What happens when no seed is provided? Is there a default value? Yes and no. Java uses an internal value called seedUniquifier which is changed everytime the constructor is called. As it is a class variable, every time the JVM is (re)started the seedUniquifier starts with the same value. To get the next value, the current value of seedUniquifier is multiplied by a constant number. So one can easily guess the values produced here. Luckily this is not everything here. Java also gets System.nanoTime() which then is XORed with the seedUniquifier.

So if one can guess the time of the constructor call and thus the value of System.nanoTime(), it should be easy to guess the final seed, right? Well, not really. There easily is a misconception about System.nanoTime(). It is not the current realtime as represented by a Unix timestamp. It is the time difference to some arbitrary point in time. By definition this can be any point in time, even in the future and is provided by the kernel. This seems strange at first, but the intention here is to use System.nanoTime() for measuring times, e.g. of function calls. Still I wanted to check how this practically looks like. Is that arbitrary point in time really that unclear? So I fired up JDK and called System.nanoTime(). And it looked like the system startup time was used as “zero”. This was also the case when launching a VM. So even if this is always the case (which is not, by definition) you have to guess the system startup time, e.g. of a webserver. At first this looks like much, but a year only has about 250 nanoseconds, which is a piece of cake for a cracking cluster (more on that later).

Still, we can do better: As said Random() is based on a linear congruential generator. In terms of implementation this reduces essentially to the following:

private static final long multiplier = 0x5DEECE66DL;
private static final long addend = 0xBL;
private static final long mask = (1L << 48) - 1;

protected int next(int bits) {
    seed = (seed * multiplier + addend) & mask;
    return (int)(seed >>> (48 - bits));
}

private static long initialScramble(long seed) {
    return (seed ^ multiplier) & mask;
}

initialScramble is called after the constructor is called and before any numbers are generated. next() is used for number generation. The code above is not complete, but the main points are there. And one is the mask, which essentially is 48 times one. So when calculating the next seed or doing the initial scramble, the leftmost 16 bits are always zeroed. This leaves an internal state of only 48 bits which again is easy to bruteforce.

Despite the fact, that nanoTime() and thus the internal state of the RNG is guessable, one can completely skip that part and directly guess the 48 bit internal state. This also means, that even a proper seed won’t help here.

Attacking the RNG

Let’s see how we can put this to good use. At first assume java.util.Random is used to generate some secret value. This can be done by producing a 128 bit key with next(). If we guess the internal state correctly, we easily can produce the same 128 bits. But to see if we guessed that value correctly, we need a way to verify our guess.

There are a few scenarios where this is applicable:

  • A HMAC is calculated and both the MAC and the message is known.
  • The key is used for encryption and the encrypted message is known. Also at least a part of the plaintext must be known.

In the first case, we can recalculate the HMAC for the message and see if our result matches the known MAC. In the latter, we can decrypt the message with our guessed key and check if we can find the known plaintext part. Both situations are easily found in practice. The first is essentially a digital signature based on a pre-shared key and the latter is default encryption. Also in most cases it is easy to guess some parts of the plaintext or at least see if the decrypted value makes sense. Simply due to the fact that we’re talking protocols, we know how messages may look like.

To perform an attack in the HMAC-scenario, we need to reimplement the mentioned core functionality of Java’s java.util.Random() and bruteforce the internal state producing 128bit output. Afterwards we can calculate the HMAC to see if our guess was correct. As every guess is independent this can be done in parallel. Like on a few hundred processors a GPU cracking cluster can provide. Depending on the size of the cluster and the used GPUs the attack can be performed within a few hours. Also as many cloud providers already provide virtual machines with GPUs attached, the costs can be as low as 100-200$.

Conclusion

While we were looking at Java’s implementation for the attack, weak RNGs impose a common issue. Having a too small internal state is just one issue and past vulnerabilities in cryptography often start with small issues, like leaking single bits or bytes. It should be assumed, that this also works for RNG.

On the other hand mitigation is easy: most programming languages offer cryptographically secure random number generators (CSRNG) and/or an interface to get random bits from the operating system. In the case of Java this would be java.util.SecureRandom.

Stay save and use proper CSRNGs,
Benjamin