Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Default Seed of Stata 12
From: "William Gould, StataCorp LP" <[email protected]>
To: [email protected]
Subject: Re: st: Default Seed of Stata 12
Date: Wed, 24 Oct 2012 11:13:38 -0500
Rasool Bux <[email protected]> asked,
> Can anybody tell me the default system values i.e. seed etc.
> of Stata 12.1
The random-number seed is set to 123456789 each time Stata is launched.
As Maarten Buis <[email protected]> noted, the value changes during
the Stata session as you use the random-number generators.
More information
----------------
I wrote this response mainly so I could say, "123456789", but
Maarten also wrote,
> The default can change during a Stata session.
>
> You can see the current value of the seed by typing di c(seed).
> See -help creturn- for this and other system values.
> Also see -help set seed- for an explanation what that weird string
> returned by -c(seed)- actually is.
and now I feel obligated to provide more details than you will find in
the manuals. So for those who are curious:
The random-number generator has something called a state. When you
-set seed-, you are specifying the state. Each time you ask for
a random number, say by using the -runiform()- function, the
state is recursively updated -- new_state = f(current_state) -- and
then a random number is produced based on the value of new_state.
The code works like this:
    random_number:
        new_state = f(current_state)
        random_number = g(new_state)
        current_state = new_state
        return(random_number)
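To make the f() and g() steps concrete, here is a minimal Python sketch of the same loop. It is not Stata's KISS generator: the 64-bit state, the constants (Knuth's MMIX linear congruential generator), and the names ToyGenerator, f, and g are all assumptions chosen only to show a state that is wider than the number it produces.

```python
MASK64 = (1 << 64) - 1

# Toy constants (Knuth's MMIX linear congruential generator); NOT Stata's KISS.
A = 6364136223846793005
C = 1442695040888963407


def f(state):
    """Advance the 64-bit state by one step."""
    return (A * state + C) & MASK64


def g(state):
    """Derive a 32-bit random number from the upper half of the 64-bit state."""
    return state >> 32


class ToyGenerator:
    def __init__(self, state):
        self.current_state = state & MASK64

    def random_number(self):
        new_state = f(self.current_state)
        number = g(new_state)
        self.current_state = new_state
        return number


gen = ToyGenerator(123456789)
draws = [gen.random_number() for _ in range(3)]
print(draws)
```

Two generators started from the same state produce the same draws, which is the whole point of being able to set the state.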
Now here's what's interesting: The state has more bits than the
random number. In the case of the KISS random number generator, the
random numbers produced are 32 bit values, and the state is a 128 bit
value! Having more bits for the state than the random number is a
general property of random-number generators and not just a property
of KISS.
When you set the seed, say by typing
. set seed 123456789
you are setting the value of current_state. A number like 123456789
is a 32-bit value. Somehow, that 32-bit value is converted to
a 128-bit value and, no matter how we do it, the resulting state can
take on only one of 2^32 values.
Setting the seed works like this:
    set_seed_32_bit_value:
        current_state = h(32_bit_value)
        burn in current_state by repeating 100 times {
            produce random number (and throw it away)
        }
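The seed-setting steps can be sketched in Python as well. This is an illustration under stated assumptions, not Stata's code: the 64-bit toy generator, the expansion function h(), and its bit layout are all hypothetical. Only the shape (expand the seed, then discard 100 draws) comes from the description above.

```python
MASK64 = (1 << 64) - 1
A = 6364136223846793005  # toy LCG multiplier (Knuth's MMIX), not Stata's
C = 1442695040888963407  # toy LCG increment


def step(state):
    """Return (new_state, 32-bit random number) for the toy generator."""
    new_state = (A * state + C) & MASK64
    return new_state, new_state >> 32


def h(seed32):
    """Hypothetical expansion of a 32-bit seed into a 64-bit state."""
    seed32 &= 0xFFFFFFFF
    return (seed32 << 32) | (seed32 ^ 0xFFFFFFFF)


def set_seed_32_bit_value(seed32, burn_in=100):
    state = h(seed32)
    for _ in range(burn_in):
        state, _discarded = step(state)  # produce a number and throw it away
    return state


state_a = set_seed_32_bit_value(123456789)
state_b = set_seed_32_bit_value(123456789)
state_c = set_seed_32_bit_value(123456790)
print(state_a == state_b, state_a == state_c)
```

The same seed always leads to the same state, and in this toy two different seeds always lead to different states, because both h() and the step function are one-to-one.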
Maarten mentioned -c(seed)- and a second syntax of -set seed- which allows
you to specify the full state. Let me explain.
First off, -c(seed)- is a misleading name because it is not the seed,
it is the state, which is related to the seed. -c(seed)- after setting
the seed to the 32-bit value 123456789 looks like this,
. set seed 123456789
. display c(seed)
X075bcd151f123bb5159a55e50022865700043e55
The strange-looking X075bcd151f123bb5159a55e50022865700043e55 is one
way of writing the full 128-bit value. X075b...55 is the result
of running set_seed_32_bit_value on the 32-bit number 123456789.
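For readers who want to poke at the string itself, the following Python sketch treats everything after the leading X as hexadecimal. The split into 8-digit words is purely for display and is an assumption; this post does not describe how the digits are laid out internally.

```python
state_text = "X075bcd151f123bb5159a55e50022865700043e55"

hex_digits = state_text[1:]        # drop the leading "X"
bit_count = 4 * len(hex_digits)    # each hex digit carries 4 bits

# Split into 8-digit (32-bit) words purely for display; the real internal
# layout is not described in this post.
words = [hex_digits[i:i + 8] for i in range(0, len(hex_digits), 8)]
first_word = int(words[0], 16)

print(len(hex_digits), bit_count)
print(words)
print(first_word)
```

Two things stand out. The first word, 075bcd15, is 123456789 written in hexadecimal. And the string holds 40 hex digits, which is more than a 128-bit value needs; what the remaining digits encode is not stated here.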
Remember that the state is updated each time a random number is
generated. Let's look at the state value after generating a random
number:
. * we have already set seed 123456789
. display runiform()
.13698408
. display c(seed)
X5b15215854f24767556efaba82801d9b0004330a
Think of the random-number generator as producing an infinitely long
sequence of states:
-------------------------------------------------------------------------
state0 -> state1 -> state2 -> ... -> state{2^124} -> state0 -> state1 ...
where,
state0 = X075bcd151f123bb5159a55e50022865700043e55,
state1 = X5b15215854f24767556efaba82801d9b0004330a,
and so on,
and where the i-th pseudo random number is given by g(state{i}).
-------------------------------------------------------------------------
The sequence may be infinitely long, but it repeats. The period is
approximately 2^124 in the case of KISS.
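A period of 2^124 cannot be walked through, but the same behavior is easy to see in a toy generator small enough to enumerate. The sketch below assumes a 16-bit linear congruential generator whose constants satisfy the standard full-period conditions; it is not KISS, and the names f and period are illustrative only.

```python
M = 1 << 16          # toy state space: 65,536 states
A, C = 25173, 13849  # toy LCG constants chosen for a full period; not Stata's


def f(state):
    """Advance the toy state by one step."""
    return (A * state + C) % M


def period(start):
    """Count steps until the state sequence returns to its starting state."""
    state = f(start)
    steps = 1
    while state != start:
        state = f(state)
        steps += 1
    return steps


cycle_length = period(0)
print(cycle_length)
```

Starting anywhere in the cycle gives the same length, which is why it does not matter which state gets labeled state0.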
The easy-to-type 32-bit seed provides 2^32 entry points into this sequence:
---------------------------------------------------------------------
state0 -> state1 -> ... -> state{2^96} -> ... -> state{2^124} -> ...
  |                             |                      |
123456789                   ????????                ??????
---------------------------------------------------------------------
I put ?????? in the above because I didn't bother to work out
the 32-bit numeric values corresponding to the particular states.
What's important is that the function state = h(32_bit_seed) is
designed to space the entry points approximately equally.
Also important to understand is that, because the sequence is
infinitely long, my numbering of the states is arbitrary.
I could have picked any one of the 2^124+1 states and labeled it 0.
What's important is that the 32-bit seed provides an entry point
into this sequence. In the last experiment we tried,
. set seed 123456789
. display runiform()
.13698408
. display c(seed)
X5b15215854f24767556efaba82801d9b0004330a
There is no 32-bit seed that you could set that corresponds to that
state.
And that is why the value of -c(seed)- looks so strange: It provides
every possible entry point into the sequence, whereas -set seed #-
provides merely a subset.
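The difference between the two ways of setting the state can be checked exhaustively on a small scale. The Python sketch below assumes a toy generator with a 16-bit state and an 8-bit seed; the constants, the seed expansion inside set_seed(), and every name are hypothetical and stand in for the 128-bit state and 32-bit seed discussed above.

```python
M = 1 << 16          # toy state space (the KISS state is far larger)
A, C = 25173, 13849  # toy LCG constants, not Stata's


def f(state):
    return (A * state + C) % M


def g(state):
    return state >> 8  # 8-bit random number from a 16-bit state


def set_seed(seed8, burn_in=100):
    """Hypothetical h(): expand an 8-bit seed to a state, then burn in."""
    state = (seed8 & 0xFF) * 257 % M
    for _ in range(burn_in):
        state = f(state)
    return state


def draw(state):
    """Return (new_state, random_number)."""
    new_state = f(state)
    return new_state, g(new_state)


# Every state that can be reached directly by setting a short seed.
entry_points = {set_seed(s) for s in range(256)}

state = set_seed(123)
state, first = draw(state)
saved = state                  # analogous to recording c(seed)
state, second = draw(state)

_, replay = draw(saved)        # resume from the saved full state

print(len(entry_points), M)
print(saved in entry_points)
print(replay == second)
```

Only 256 of the 65,536 states are entry points, so a state captured in mid-stream is usually not one of them. Restoring the saved full state nevertheless resumes the stream exactly where it left off.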
Do I have to say it? If this kind of thing interests you, consider a
career at StataCorp.
-- Bill
[email protected]