InfoSecurity India's First Magazine on Comprehensive IT Security
Menu Bar
InfoSecurity February 2009
Tech Focus

Elements of Pseudo-Random Number Generators

Although the term Pseudo-Random Number Generator (PRNG) is appropriate and is used in conjunction with IT Security, the phrase "random numbers" can be misleading to many of us. The reality is security-critical applications require “random” numbers. In this article, we will discuss the elements of PRNGs and CPRNGs in concurrence with computer security in the real-world secure systems.

While we all in general presume that—what Random number generators has to do with security—the reality is that Random Number Generation is an essential part of any secure application. To many people, it suggests random number generator functions in the math libraries which come with one's compiler. Such random generator functions are insecure and have to be avoided for strong cryptographic purposes.

What one needs for cryptography are values which cannot be guessed by an adversary any more easily than by trying all possibilities (i.e., through brute force). As a matter of fact, there are several ways to acquire or generate such values, but none of them is guaranteed. Therefore, selection of a random number source in general term is a matter of art and assumptions.

Why we need Random Numbers?

Security-critical applications require “random” numbers. Unfortunately, there is no consensus on the best way of obtaining such random numbers. Moreover, there is not a consistent set of requirements or terminology between different solutions.

Ultimately, we would like generators capable of giving “true” random numbers, where all 2n possible values for an n bit datum are equally likely from the point of view of an adversary. i.e., we would like a random number generator to provide information theoretic security, where there is one bit of entropy per bit of generator output.

In practice, that goal often isn’t feasible due to the difficulty of harvesting high-entropy data. Instead, it is common to take data that is believed to contain enough entropy for cryptographic security, and use it to “seed” a cryptographic pseudo-random number generator, which is an algorithm that produces a stream of values that one hopes, for all intents and purposes, is indistinguishable from random to any attacker in practice, even if a theoretical attack were possible, given unbounded resources.

What is PRNG?

A pseudorandom number generator (PRNG) is an algorithm for generating a sequence of numbers that approximates the properties of random numbers. The sequence is not truly random in that it is completely determined by a relatively small set of initial values, called the PRNGs state. Although sequences that are closer to truly random can be generated using hardware random number generators.

Most pseudo-random generator algorithms produce sequences which are uniformly distributed by any of several tests. Common classes of these algorithms are linear congruential generators, lagged Fibonacci generators, linear feedback shift registers and generalized feedback shift registers. Recent instances of pseudo-random algorithms include Blum Blum Shub, Fortuna, and the Mersenne twister.

Interfaces to Random Numbers

People tend to use pseudo-random numbers instead of numbers that are secure in the information theoretic sense (i.e., truly random numbers with entropy to an attacker) only because pseudo-random numbers are efficient to generate, whereas data with large amounts of entropy tends to not be. One advantage of using truly random numbers is that the difficulty of breaking one random value (by, say, brute-force guessing) is independent of breaking any other value. That is not true with a cryptographic PRNG, where an attacker with enough output can predict future outputs with a probability of 1, assuming enough computational resources. A good PRNG will provide computational security, meaning that an adversary with realistic resources should not be able to distinguish PRNG output from truly random data. However, PRNGs are incapable of providing information theoretic security in the way that a one-time pad can.

We believe that the gap between information theoretic security and computational security argues that systems should not only provide an interface to cryptographically secure pseudo-random numbers, they should also provide an interface to numbers that are believed to be truly random. Even though most applications will not be willing to expend significant resources to get truly random numbers, there are occasions where it can be useful.

For example, there may be some high security data where one would like to ensure that the compromise of one piece of data does not lead to the compromise of other pieces of data, such as long-term, high security keys.

Another consideration is that a system may wish to have more security than a pseudo-random number generator has the strength to provide. For example, the PRNGs provided by most operating systems use cryptography that is based on a 128-bit symmetric cipher or a 160-bit hash function, and thus can never have more than 160 bits of security at any given time. But, one may wish to generate a 256-bit symmetric key, or a public key pair where 256-bit equivalent security is desired.

Note that using two 128-bit PRNGs that are independently seeded does not solve the problem, only doubling an attackers workload, increasing security to 129 bits instead of the desired 256.

Many systems intertwine entropy harvesting and pseudo-random number generation. Those that do not still are generally only concerned with providing a stream of pseudo-random numbers. However, we believe that a good randomness infrastructure should provide at least two interfaces, one that is fast and cryptographically strong, and one where data that is hopefully truly random, with one bit of entropy per bit of output.

There are other types of interfaces that one may wish to provide. Particularly, we see two other useful concepts. First, one may wish to have truly random data if it is available, but still be willing to fall back on pseudo-random data to fill in the rest.

Second, there are times where the state of a pseudorandom number generator need not be secret - it is only important that the outputs be strongly collision resistant. Particularly, the attacker who knows the algorithm for generating outputs should not be able to take a different internal state and be able to produce the same output. This is the case when choosing session IDs for TCP/IP, and can be the case when choosing public object identifiers in systems like COM. A cryptographic PRNG will do for these purposes, except when there is not enough entropy available. We prefer to look for other solutions to this problem, as such an API is likely to be misused.

Significance

Careful mathematical analysis is required to have any confidence a PRNG generates numbers that are sufficiently "random" to suit the intended use. Robert R. Coveyou of Oak Ridge National Laboratory once titled an article, "The generation of random numbers is too important to be left to chance." As John von Neumann joked, "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."

Because of its mathematical importance, the pseudo-random numbers are important and are central in the practice of cryptography.

Periodicity

A PRNG can be started from an arbitrary starting state, using a seed state. It will always produce the same sequence thereafter when initialized with that state. The maximum length of the sequence before it begins to repeat is determined by the size of the state, measured in bits. However, since the length of the maximum period potentially doubles with each bit of 'state' added, it is easy to build PRNGs with periods long enough for any practical application.

If a PRNG's internal state contains n bits, its period can be no longer than 2n results. For some PRNGs the period length can be calculated without walking through the whole period. Linear Feedback Shift Registers (LFSRs) are usually chosen to have periods of exactly 2n-1. Linear congruential generators have periods that can be calculated by factoring. Mixes (no restrictions) have periods of about 2n/2 on average, usually after walking through a non-repeating starting sequence. Mixes that are reversible (permutations) have periods of about 2n-1 on average, and the period will always include the original internal state. Although PRNGs will repeat their results after they reach the end of their period, a repeated result does not imply that the end of the period has been reached.

Cryptographically Secure PRNG

A PRNG suitable for cryptographic applications is called a Cryptographically Secure PRNG (CSPRNG). The difference between a PRNG and a CSPRNG is not simple: a CSPRNG must meet certain design principles and be resistant to known attacks. Years of review are required before such an algorithm can be certified and it is still possible attacks will be discovered in the future.

Some classes of CSPRNGs include the following:

  • Stream ciphers.

  • block ciphers running in counter or output feedback mode.

  • Special designs with a security proof. For example Blum Blum Shub has a strong security proof, although the proof is conditional and requires that some internal values remain secret from any attacker. Moreover, Blum Blum Shub is too slow for many applications.

  • PRNGs that have been designed specifically to be cryptographically secure, such as Microsoft's Cryptographic Application Programming Interface function CryptGenRandom, the Yarrow algorithm (incorporated in Mac OS X and FreeBSD), and Fortuna.

  • Combination PRNGs which attempt to combine several PRNG primitive algorithms with the goal of removing any non-randomness.

Applications of CSPRNG

A cryptographically secure pseudo-random number generator (CSPRNG) is a pseudo-random number generator (PRNG) with properties that make it suitable for use in cryptography.

Many aspects of cryptography require random numbers, some of them are:

  • Key generation.

  • Nonces.

  • Salts in certain signature schemes, including ECDSA, RSASSA-PSS.

  • One-time pads.

As the above stated processes are just some of the components of security which requires CSPRNG, we will not go into the details of each of them.

The “quality” of the randomness required for these applications varies. For example creating a nonce in some protocols needs only uniqueness. On the other hand, generation of a master key requires a higher quality, such as more entropy. And in the case of one-time pads, the information-theoretic guarantee of perfect secrecy only holds if the key material is obtained from a true random source with high entropy.

Ideally, the generation of random numbers in CSPRNGs uses entropy obtained from a high quality source, which might be a hardware random number generator or perhaps unpredictable system processes. Though unexpected correlations have been found in several such ostensibly independent processes, from an information theoretic point of view, the amount of randomness, the entropy that can be generated is equal to the entropy provided by the system. But sometimes, in practical situations, more random numbers are needed than there is entropy available. Also the processes to extract randomness from a running system are slow in actual practice. In such instances, a CSPRNG can sometimes be used. A CSPRNG can "stretch" the available entropy over more bits.

When all the entropy we have is available before algorithm execution begins, we really have a stream cipher. However some crypto system designs allow for the addition of entropy during execution, in which case it is not a stream cipher equivalent and cannot be used as one. Stream cipher and CSPRNG design is thus closely related

Box:-

Requirements of CSPRNG

The requirements of an ordinary PRNG are also satisfied by a cryptographically secure PRNG, but the reverse is not true. CSPRNG requirements fall into two groups: first, that their statistical properties are good (passing statistical randomness tests); and secondly, that they hold up well under serious attack, even when part of their initial or running state becomes available to an attacker.

  • Every CSPRNG should satisfy the "next-bit test". The next-bit test is as follows: Given the first k bits of a random sequence, there is no polynomial-time algorithm that can predict the (k+1)th bit with probability of success higher than 50%. Andrew Yao, a prominent computer scientist and computational theorist proved in 1982 that a generator passing the next-bit test will pass all other polynomial-time statistical tests for randomness.

  • Every CSPRNG should withstand 'state compromise extensions'. In the event that part or all of its state has been revealed (or guessed correctly), it should be impossible to reconstruct the stream of random numbers prior to the revelation. Additionally, if there is an entropy input while running, it should be infeasible to use knowledge of the input's state to predict future conditions of the CSPRNG state.

Example: If the CSPRNG under consideration produces output by computing bits of "Pi" in sequence, starting from some unknown point in the binary expansion, it may well satisfy the next-bit test and thus be statistically random, as "Pi" appears to be a random sequence. (This would be guaranteed if pi is a normal number, for example.) However, this algorithm is not cryptographically secure; an attacker who determines which bit of pi (ie the state of the algorithm) is currently in use will be able to calculate all preceding bits as well.

Most PRNGs are not suitable for use as CSPRNGs and will fail on both counts. First, while most PRNGs outputs appear random to assorted statistical tests, they do not resist determined reverse engineering. Specialized statistical tests may be found specially tuned to such a PRNG that shows the random numbers not to be truly random. Second, for most PRNGs, when their state has been revealed, all past random numbers can be retrodicted, allowing an attacker to read all past messages, as well as future ones.

CSPRNGs are designed explicitly to resist this type of cryptanalysis.

CSPRNG Designs:

In the discussion below, CSPRNG designs are divided into three classes:

  1. Those based on cryptographic primitives such as ciphers and cryptographic hashes,

  2. Those based upon mathematical problems thought to be hard, and

  3. Special-purpose designs. The last often introduce additional entropy when available and, strictly speaking, are not "pure" pseudorandom number generators, as their output is not completely determined by their initial state. This addition can prevent attacks even if the initial state is compromised.

Designs based on cryptographic primitives:

  • A secure block cipher can be converted into a CSPRNG by running it in counter mode. This is done by choosing a random key and encrypting a zero, then encrypting a 1, then encrypting a 2, etc. The counter can also be started at an arbitrary number other than zero. Obviously, the period will be 2n for an n-bit block cipher; equally obviously, the initial values (ie, key and "plaintext") must not become known to an attacker lest, however good this CSPRNG construction might be otherwise, all security be lost.

  • A cryptographically secure hash of a counter might also act as a good CSPRNG in some cases. In this case also, it is necessary that the initial value of this counter is random and secret. If the counter is a bignum, then the CSPRNG could have an infinite period. However, there has been little study of these algorithms for use in this manner, and at least some authors warn against this use (Young and Yung, Malicious Cryptography, Wiley, 2004, sect 3.2).

  • Most stream ciphers work by generating a pseudorandom stream of bits that are combined (almost always XORed) with the plaintext; running the cipher on a counter will return a new pseudorandom stream, possibly with a longer period. The cipher is only secure if the original stream is a good CSPRNG (this is not always the case: see RC4 cipher). Again, the initial state must be kept secret.

Number theoretic designs:

Blum Blum Shub has a strong, though conditional, security proof, based on the presumed difficulty of integer factorization. However, implementations are slow compared to some other designs.

Special designs:

There are a number of practical PRNGs that have been designed to be cryptographically secure, including:

  • The Yarrow algorithm which attempts to evaluate the entropic quality of its inputs, and an updated version, Fortuna, which does not. Yarrow is used in OpenBSD and Mac OS X.

  • The UNIX special file /dev/random, particularly the /dev/urandom variant as implemented on Linux.

  • The function CryptGenRandom provided in Microsoft's Cryptographic Application Programming Interface.

  • The Python function urandom in the os module, which uses /dev/urandom on Unix-based systems, including OS X, and CryptGenRandom on Windows-based systems.

  • ISAAC based on a variant of the RC4 cipher.

  • ANSI X9.17 standard (Financial Institution Key Management (wholesale)), which has been adopted as a FIPS standard as well. It takes as input a 64 bit random seed s, and a DES key k.

—By:R. Manoj. The author is an Assistant Editor at Fanatic Media, Bangalore.


Home   |   Current Issue   |   Archives   |   Subscription   |   Advertisement   |   Contacts

© 2006-07 'InfoSecurity' magazine. All rights reserved.
Website designed, developed and maintained by Fanatic Media