Randomness Tests
There are a number of ways to test the fitness of Pseudo-Random Number Generators, and several popular suites like the Diehard tests or the TestU01 library have been developed for this express purpose. Typically, these tools examine the output of a Pseudo-Random Number Generator in various ways, ultimately returning values that can be used to determine how well the PRNG faired.
But perhaps we can do better.
Since our study involves re-implementing each PRNG as well as testing them, we can fine-tune our test suite to match the PRNGs directly. In fact, we can even examine the internal state of every PRNG being tested - something none of the other tools have attempted.
Below is a summary of each test we'll be using, along with some additional information about how these tests are conducted.
But perhaps we can do better.
Since our study involves re-implementing each PRNG as well as testing them, we can fine-tune our test suite to match the PRNGs directly. In fact, we can even examine the internal state of every PRNG being tested - something none of the other tools have attempted.
Below is a summary of each test we'll be using, along with some additional information about how these tests are conducted.
Test Seeds
As PRNGs are made up of a fixed sequence of instructions, they will always produce the same series of numbers. The first number - used to initialize the PRNG - is known as the Seed. By controlling the Seeds that are used, we can ensure that each test uses the same sequence of "random" numbers.
In real-world applications, programmers often use some outside source, such as the computer's hardware clock, to provide an unpredictable Seed.
Unless otherwise stated, our tests use each of the Seeds in the following table:
In real-world applications, programmers often use some outside source, such as the computer's hardware clock, to provide an unpredictable Seed.
Unless otherwise stated, our tests use each of the Seeds in the following table:
SEED | NOTES |
1138 | This number appears in Star Wars films and other works by George Lucas as an easter egg. The number itself is a reference to THX 1138, one of his early works. |
65535 | This is the largest value that can be held by a 16-bit unsigned integer. Most PRNGs that we will test use 32-bit internal states, making this a good cut off. |
8675309 | Written by Alex Call and Jim Keller, 8675309/Jenny was a popular song in the 1980s. The number itself is her phone number. |
16777216 | Much like 65535, this is the upper limit for 24-bit unsigned integers. It's also the maximum number of colors available under the True Color graphics system. |
123456789 | A nine-digit number consisting of the first nine digits. This is also large enough that some PRNGs may need to adjust the number in order to use it. |
Test Suite
This section describes the different tests used by our custom test suite.
Count the 1s
While the actual numbers produced by a PRNG can appear random, it's possible that there is a bias in favor of 1s or 0s in the output's binary values. To check for this, 100,000 bits (not results) are generated and the 1s in the output are counted. As each PRNG has different output sizes, the exact number of calls to the PRNG will vary, but the number of 1s should still be close to 50% of the total output.
Crush Test
This test uses the famous zlib library to see how redundant the PRNG's output is. Compression schemes like those in this library work by finding repeated information and replacing it with smaller symbols. In theory, random data shouldn't have enough repetition, and in turn, files created using random data shouldn't compress well.
To test this theory, 10KB of random data is generated from the PRNG and stored in a file. This file is then compressed, and the sizes of the two files is compared. Ideally, the "compressed" file will be larger than the original, as it contains tables and other details that take up extra room.
To test this theory, 10KB of random data is generated from the PRNG and stored in a file. This file is then compressed, and the sizes of the two files is compared. Ideally, the "compressed" file will be larger than the original, as it contains tables and other details that take up extra room.
Dartboard Test
In this simplified version of Diehard's Parking Lot Test, the PRNG is used to generate the locations of 12,000 darts, which are then plotted on a 100x100 grid. If no dart is in a specific location, it is considered a "hit" and the location is marked as taken.
An unbiased PRNG should be able to place about 2,600 darts out of the 12,000 total.
An unbiased PRNG should be able to place about 2,600 darts out of the 12,000 total.
Distribution Test
A truly random sequence shouldn't favor any number or range of numbers. So, 10,000 results are drawn from the PRNG as floating point numbers between 0 and 1 (inclusive). The results are then listed, allowing for any obvious biases to be easily located in the PRNG's output.
High/Low Test
A simple way to find patterns in a PRNG's output is to look for any relationships between sequential outputs. In this case, an output is considered to be a "high" value if it is above 50% of the PRNG's range, and a "low" value otherwise. 10,000 results are checked to see if there are any noticable patterns, such as a "high" number regularly following a "low" number, or vice versa.
Period Length Test
One of the drawbacks of using a PRNG is that it will eventually reach a point where its internal state either resets or becomes stuck in a loop. The length of time required for this to happen is known as the PRNG's cycle or period.
In order to find a PRNG's period length, the PRNG is queried and its internal state is compared to the previously seen states. If there's a match, then the state has begun repeating and the end of the PRNG's period has been reached. This test is repeated 64,000 times, and if the PRNG's state has never repeated, then the PRNG is considered to have a reasonably long period.
In order to find a PRNG's period length, the PRNG is queried and its internal state is compared to the previously seen states. If there's a match, then the state has begun repeating and the end of the PRNG's period has been reached. This test is repeated 64,000 times, and if the PRNG's state has never repeated, then the PRNG is considered to have a reasonably long period.
Plot Test
Some times the best way to examine data is to see it for yourself, and that's where this test comes in. 50,000 positions on a 250x250 grid are charted using the output of the PRNG. The more times a location appears, the darker its pixel becomes.
Once the initial grid has been created and saved, it is used to generate a second grid. This second grid uses a crude blur filter to reduce the resolution of the original grid, which makes patterns much easier to spot.
Ideally, both grids should look like static or fuzz, indicating that there are no clear patterns in the results. A poor quality PRNG will produce a lot of white spaces in the second grid, if not both.
Once the initial grid has been created and saved, it is used to generate a second grid. This second grid uses a crude blur filter to reduce the resolution of the original grid, which makes patterns much easier to spot.
Ideally, both grids should look like static or fuzz, indicating that there are no clear patterns in the results. A poor quality PRNG will produce a lot of white spaces in the second grid, if not both.
Unique Byte Test
To check for biases in the extreme ends of the output, 255 results are collected and their high and low bytes compared. The idea here is that truly random sequences won't repeat that many bytes, resulting in a large number of unique values. However, if there are a lot of duplicates, then the PRNG is likely creating a non-random pattern of some sort.