Hello,
After a bit of experimenting, I've developed a new random number generator
that is quite fast. I've tested it against diehard, dieharder and testu01,
and it passes all tests. (Except for the broken ones in dieharder which
fail all good generators.)
The generator has 2^128 bits of state, and a period of 2^128. However, in
the code below I use the address of the seed as a salt. This causes each
thread in a parallel program to have a different sequence. (Assuming you
use the a separate seed instance per thread to avoid locking issues.)
This means that equivalently, you can think of this generator as having a
period of 2^160+, with each thread being assigned a period 2^128 sub-
sequence. (This assumes that the pointer to the seed has 32bits or so of
state out of the 64 total.)
The generator is based on a 128bit counter that is incremented by an odd
number every iteration. The most significant 64 bits of the state are then
hashed via a reversible hash function. This is combined with the lower
64bits to produce a 64bit pseudo random number.
This generator will produce 2^64 instances of each number from 0 to 2^64-1
in a randomized sequence, ensuring a level of equidistribution.
Since 128 bit integers are non-standard, I provide three implementations
below. If NO_U128B is #defined, then the code will implement the carry from
the bottom 64bits to the top 64bits by hand. (This is the most portable
implementation.) The second implementation uses the "__uint128_t" type of
gcc. The third implementation (used if GCC_ASM is defined), is in assembly
since gcc isn't particularly good at optimizing the use of 128bit integers.
The asm implementation takes 4.2s to generate 10^9 64bit random numbers.
This can be compared to the 64bit KISS generator, which takes 7.3s on the
same machine. Other generators which pass the testu01 bigcrush suite tend
to be even slower.
The constant chosen below was found after a hill-climbing search for a hash
with good avalanche properties. It seems most "random" numbers with about
half their bits set work fairly well.
This generator seems to be ideal for Monte-Carlo applications which require
good random numbers quickly.
Steven
#define GCC_ASM
typedef unsigned long long u64b;
static u64b rng(u64b *s)
{
#ifndef GCC_ASM
u64b c = 7319936632422683419ULL;
u64b z = s[1];
/* Try to increment 128bit counter using adc instruction */
#ifdef NO_U128B
s[0] += c;
s[1] += c + (s[0] < c);
#else
typedef __uint128_t u128b;
union u128_64
{
u64b seed[2];
u128b val;
};
((union u128_64 *)&s[0])->val += c + ((u128b) c << 64);
#endif
/* Hash result */
z ^= (z >> 32) ^ (u64b) s;
z *= c;
z ^= z >> 32;
z *= c;
return z + s[0];
#else
u64b z;
asm volatile(
"mov 0x8(%1),%0\n"
"shr $0x20,%0\n"
"mov $0x6595a395a1ec531b,%%rcx\n"
"xor 0x8(%1),%0\n"
"add %%rcx,(%1)\n"
"adc %%rcx,0x8(%1)\n"
"imul %%rcx,%0\n"
"mov %0,%%rdx\n"
"shr $0x20,%0\n"
"xor %%rdx,%0\n"
"imul %%rcx,%0\n"
"add (%1),%0\n"
:"=r" (z): "rm"(s): "%rcx", "%rdx", "memory", "cc");
return z;
#endif
}
static u64b seed[2];
/* Sample usage */
int main(void)
{
u64b result;
/* Initialize seed */
seed[0] = something
seed[1] = something else.
/* Get a random 64bit integer */
result = rng(&seed);
/* Get another random 64bit integer */
result = rng(&seed);
}