Re: st: How to create a random number identifier number

From   Michael McCulloch <[email protected]>
To   [email protected]
Subject   Re: st: How to create a random number identifier number
Date   Wed, 11 Nov 2009 19:16:06 -0800

This simulated example is a better approach, one that respects your need for newpersonid to have 5 digits.

********* begin example
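clear
* arbitrary seed, so that the random order can be reproduced
set seed 12345
set obs 11000
* simulated original ids 10001-21000: unique and 5 digits long
gen long personid = 10000 + _n

* random sort key; personid breaks any ties in the same way every run
gen double sortvar = uniform()
sort sortvar personid

* position in the random order gives new ids 50001-61000
gen long newpersonid = 50000 + _n
* stop with an error if the new ids are not unique
isid newpersonid

list personid newpersonid in 10050/11000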
********* end example
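
With real data rather than simulated data, the same steps could be sketched roughly as below. This is only a sketch under assumptions: the file names mydata.dta and concordance.dta are hypothetical, the seed is arbitrary, the data are assumed to hold one observation per personid, and the offset of 10000 is just one choice that keeps every new id at 5 digits.

********* begin example
* mydata.dta is a hypothetical file name for the original data
use mydata, clear
keep personid
* assumption: one observation per person; isid stops with an error otherwise
isid personid
* 10000 + _n stays at 5 digits only while there are at most 89999 persons
assert _N <= 89999

* arbitrary seed, so that the same concordance can be rebuilt later
set seed 20091111
gen double sortvar = uniform()
sort sortvar personid
gen long new_personid = 10000 + _n

drop sortvar
sort personid
* concordance.dta is a hypothetical file name for the lookup file
save concordance, replace
********* end example

The saved file holds only personid and new_personid, sorted by personid, so it can be merged back onto the original data by personid. Anyone who learns the seed and the original ids could rebuild the mapping, so the seed should be kept as private as the concordance itself.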

Dear Anna, if you sort on some variable other than personid, or perform a random sort, you could then:
	gen new_personid = _n
This creates a variable which has a value equal to the sequence # of that record, which is why you have to create some sort order other than personid.

On Nov 11, 2009, at 6:37 PM, Anna Reimondos wrote:

I am experiencing problems creating a set of unique numbers for my dataset.

I have a dataset with around 11,000 subjects or persons, and each one
of these subjects has a unique identifier that is 5 digits long.
I need to create a concordance file that lists the original 5-digit
"personid" and matches it to a new, randomly created identifier
for each person. This new identifier (new_personid) also has to be 5
digits long.

personid   new_personid
10526        35624
18594        21893
54632        12489

I have tried playing around with gen x = uniform(), but
to no avail: I am unable to create exactly 11,000 unique numbers with
5 digits.
I also tried the egen x=seq() command, but then the ids are
sequential and not random, and I am afraid that someone could
figure out how to match the personid and the new_personid.

Any help would be much appreciated,


(Using STATA 10.1, Windows Vista)

