Dear all,
I have a dataset of nearly 3000 couples. Each record contains data on
both the man and the woman. There are some polygamous marriages, so one
man can have two or three wives instead of one. In that case, there are
two or three records with the same data for the man and different data
for each wife.
For my analyses of men, I want only one record per polygamous man.
Therefore, I identify duplicates based on ID numbers. Before doing so,
I sorted the data on ID number and one variable for the men, namely
mobility status. This status is the same for a given man across his
different wives. However, the data for the corresponding wives differ. I
noticed that Stata orders the wives of a polygamous man differently,
seemingly at random, from one run to the next. See the following example:
Attempt 1:
data of man 1 - wife 1
data of man 1 - wife 2
data of man 1 - wife 3
Attempt 2:
data of man 1 - wife 3
data of man 1 - wife 1
data of man 1 - wife 2
I kept the first record of every duplicate. This means that in attempt 1
Stata kept wife 1, and in attempt 2 wife 3. By itself this might not
cause big problems, but afterwards I make another selection, because I
also need to know the mobility status of the women. Suppose this status
is known for wives 1 and 2 but missing for wife 3. Then attempt 2 yields
different numbers than attempt 1. That is indeed what happened: I found
small differences between attempts (number of records: 1519 vs 1517 vs
1512).
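For concreteness, the steps described above can be sketched in Stata
roughly as follows. This is only an illustration: the variable names
idnumber, mob_man and mob_wife are placeholders rather than the actual
names in my dataset, and the second selection is assumed to drop records
where the wife's status is missing.

```stata
* Sketch only: idnumber, mob_man and mob_wife are placeholder names.

* Sort on the man's ID and his mobility status. Records of the same man
* are tied on both variables, so their relative order is not determined.
sort idnumber mob_man

* Keep the first record per man; which wife this is depends on the
* order produced by the sort above.
by idnumber: keep if _n == 1

* Second selection (assumed): require the wife's mobility status.
drop if missing(mob_wife)

* Number of remaining records; this is the figure that varied.
count
```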
My questions:
1. Why does Stata order records randomly when they are tied on the
sort variables?
2. How can I make sure that the same selection is made every time?
Thanks in advance for your time!
Best wishes,
Debby