Tuesday, August 31, 2010

16 bit xorshift rng (now with more period)

So, in Marsaglia's paper, the basic xorshift rng the code is:
yˆ=(y<<a); y^=(y>>b); return (yˆ=(y<<c));

I looked at this in my last post, and had a brief look at the (a,b,c) triplets to use. However, this only yields a period length of the word size used. In my case, that's 2^16-1 -- not much. However, further on in the paper, section 3.1, a method is described for longer periods through additional state and promotion:
t=(xˆ(x<<a)); x=y; return y=(yˆ(y>>c))ˆ(tˆ(t>>b));

This is listed as 2^64-1 using 32-bit words, but there's no reason we can't use 16-bit words. This reduces the period to 2^32-1, and requires new triplets. I scanned (1,1,1) through till (15,15,15), here's the list:


Values marked with an asterisk score well on the majority of diehard tests. It's probably not a good idea to use the other values, even though they do have a complete period. Notably, all values fail binary matrices tests for 31x31 and 32x32 - I think this is expected, and mentioned in Marsaglia2003.

So, the routine in C:
uint16_t rnd_xorshift_32() {
static uint16_t x=1,y=1;
uint16_t t=(x^(x<<5));
return y=(y^(y>>1))^(t^(t>>3));

and in assembly:
; returns a 16-bit value >0, has a 2^32-1 period
push cx
push dx
db 0b8h ; t=(x^(x shl a))
random_x: dw 1
mov dx,ax
mov cl,5
shl dx,cl
xor ax,dx
mov dx,ax ; y=(y^(y shr c))^(t^(t shr b))
mov cl,3 ; ^^^^^^^^^^^
shr dx,cl
xor ax,dx
push ax ; save t^(t shr b)
db 0b8h
random_y: dw 1
mov cs:[random_x],ax ; x=y
mov dx,ax ; y=(y^(y shr c))^(t^(t shr b))
shr dx,1 ; ^^^^^^^^^^^
xor ax,dx
pop dx
xor ax,dx
mov cs:[random_y],ax
pop dx
pop cx


Ben Jamin' said...

Which triplet passes the most tests? (5,3,1)?

Brad Forschinger said...

From memory the asterisked triplets performed equally well.

Russell Borogove said...

Thanks for this - I was looking for a suitable RNG to run on an AVR microcontroller (8-bit, but the register file is generous enough that the compiler can handle wider operations reasonably) and 64-bit xorshift seemed overkill.

Brad Forschinger said...

To avoid register churn you can use 8-bit sizes and the form mentioned in 3.1:

static uint8_t Q[4]={1,1,1,1};
uint8_t g(unsigned int i, unsigned int j, unsigned int k, unsigned int l)
uint8_t t;
t = (Q[0] ^ (Q[0] << i)) ^
(Q[1] ^ (Q[1] >> j)) ^
(Q[2] ^ (Q[2] << k)) ^
(Q[3] ^ (Q[3] << l));
Q[0] = Q[1]; Q[1] = Q[2]; Q[2] = Q[3]; Q[3] = t;
return t;

where (i,j,k,l) is one of:


hopper said...

Is it necessary to start with x = y = 1 or can I use any non-zero values?



hopper said...

I'm also curious how to test 16 bit values with Diehard or other test suite. As far as I knew, they mostly expect 32 bit values?

Brad Forschinger said...

Hi M,

For initialization you'll need to ensure `x != 0 || y != 0`.

I used the Diehard test suite at the time, but nowadays you can use `PractRand stdin16`.


hopper said...

Thanks Brad

Yes I had already made sure that x and y were both non-zero. I am having to take them outside the function - that they are sent as seed arguments so that I can use this PRNG with OpenMP in parallel threads. It's working well.

I have tried PractRand now (v0.93 & v0.95) and it works for stdin but claims an input error with stdin16. Not sure why.

hopper said...

By the way, testing your asterisk parameter sets with PractRand stdin also chooses {5,3,1} as failing the fewest tests.