# Porting ISAAC to Python

3 months ago, I discovered an algorithm called ISAAC on reddit. ISAAC (Indirection, Shift, Accumulate, Add, and Count) is a cryptographically secure pseudorandom number generator (CSPRNG) designed in 1996 by Bob Jenkins. It produces 32-bit unsigned integers uniformly distributed, unbiased, and unpredictable. Cycles are guaranteed to be a minimum of 2^{40} values long and they are 2^{8295} values long on average. The generator runs fast and performs 18.75 machine cycles on average .

I’ve wanted to write my first Python extension for a long time; the short C implementation combined with a simple API solely composed of two functions – one for initializing the generator, one for generating random numbers – made ISAAC a great candidate for this task.

#### Writing Python extensions is easy

Given my inexperience, this post is not going to be an umpteenth tutorial about writing extensions, I would rather link you toward resources I found useful for this purpose. However, I would like to point out how easy it is to write extensions: Python’s documentation is clear, extensive, and provides a great example that guides you step by step. In addition, the C API is simple and well documented too. This makes writing extensions a very pleasant experience.

Here’s what you need to get started:

- Extending and Embedding the Python Interpreter, Python’s official documentation;
- Python Extension Programming with C, Tutorials Point;
- Python modules in C, Dan Foreman-Mackey.

The rest depends on your C skills. Hello, segmentation fault!

#### Code

Since, I’m a very fancy guy, I named the package… drum roll… “*pyisaac*”! The code is available on GitHub under MIT license. I’m far from being a C expert, comments, and PR are very welcome. The package is hosted on PyPI and can be installed via `pip`

or `easy_install`

:

#### Use

As for the original implementation, the API is simple:

#### Implementation

##### seeding routine

Bob Jenkins provides no official seeding routine for the algorithm. At first, as suggested by redditor BonzaiThePenguin, I used 1024 bytes of data from `/dev/random`

. Since this source of randomness is potentially blocking (more about it here and there) and not cross-platform, later, I’ve decided later to rely on Python `os.urandom()`

. On a UNIX-like system this function will query `/dev/urandom`

and on Windows it will use `CryptGenRandom()`

. If no randomness source is found, the algorithm is seeded from the system time and **then MUST NOT be used for cryptographic purposes**.

##### when unsigned long integers were short

I have slightly modified the original file standard.h. ISAAC relies on 4-byte integers arithmetic. Back in 1996, unsigned long integers were 4-byte long but nowadays, they are an 8-byte quantity on 64-bit processors. So I replaced them with unsigned integers instead.

##### mapping integers to floats in [0, 1]

ISAAC generates random integers ranging from 0 to 2^{32}-1. In theory dividing them by 2^{32}-1 should produce distinct floating point numbers between 0 and 1. In practice, this doesn’t work very well, as illustrated by the following C code snippet:

Output:

This behavior is related to how floating point numbers are represented. The precision offered by the float type is not enough to account for the infinitesimal increment between `a`

and `b`

and causes collisions.

##### double to the rescue

To circumvent this problem, *pyisaac* relies on the double type. A double is 64-bit long whereas a float is 32-bit and is sufficiently accurate to map each integer to a distinct floating point number between 0 and 1.

##### one and one makes two

This approach presents a drawback: the generated doubles have a 32-bit resolution and do not make the most of the extra precision brought by the double type. I’ve taken a look at how CPython random generator is implemented and the core developers came out with an elegant solution: two 32-bit random numbers are generated in order to “fill” one double and achieve 53-bit resolution:

Neat, isn’t it?

#### Testing

##### testing the implementation

Bob Jenkins provides a seed and the first 2600 random numbers generated from it. *pyisaac* is tested against these numbers and gets them right… but this only means the implementation is right, not that the algorithm generates truly uniformly distributed, unbiased, and unpredictable numbers.

##### testing the algorithm

In fact, it is impossible to mathematically demonstrate that a generator is indeed a random bit generator. Consequently, generators are submitted to batteries of statistical tests which aim to detect certain kinds of weaknesses such as:

- shorter than expected periods for some seed states;
- lack of uniformity of distribution for large amounts of generated numbers;
- correlation of successive values, etc.

Dieharder is an example of such a battery of tests. ISAAC successfully passes all the Diehard and NIST statistical tests. Both these test suites are part of Dieharder.

#### Performance

Let’s get a rough sense of how *pyisaac* performs compared to `random.random()`

from the standard library and `numpy.random.rand()`

. These PRNGs implement the Mersenne twister algorithm:

As can be seen, *pyisaac* runs within the same order of magnitude. And… yes… Python adds much overhead compared to the native implementation… But starting a flame war is not the point of this post :)

That’s all folks! I’d like to thank Mike McFadden who gave me valuable advice on how to seed the generator properly.

comments powered by Disqus