st: RE: looking for more efficient programming for randomly shuffling list of numbers


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: looking for more efficient programming for randomly shuffling list of numbers
Date   Thu, 26 Aug 2010 16:54:22 +0100

I don't understand most of this, but you could generate all your random numbers as a single variable and assign identifiers 1...20 within blocks of 20. 

Something like this:  

set obs 10000
set seed 2803 
gen long order = _n 
gen random = runiform()
egen block = seq(), block(20)
bysort block (random) : gen sticker = _n 
sort order 

I can't see why you need a more complicated data structure with multiple variables and indeed multiple datasets at all. 
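If one row per sticker is wanted for printing, the long layout above could be reshaped afterwards. The following is only a sketch under stated assumptions: 9,600 stickers of 20 numbers each (the count mentioned in the original question), a hypothetical helper variable `slot` for the position on the sticker, and a hypothetical output file name.

```stata
clear
set obs 192000                          // 9,600 stickers x 20 numbers (assumed target)
set seed 2803
gen long order = _n
gen double random = runiform()          // double storage makes ties less likely
egen block = seq(), block(20)           // block identifier, one block per sticker
bysort block (random) : gen byte sticker = _n
sort order
gen byte slot = mod(order - 1, 20) + 1  // hypothetical name: position 1..20 on the sticker
keep block slot sticker                 // reshape needs everything else constant within block
reshape wide sticker, i(block) j(slot)  // one row per sticker: sticker1 ... sticker20
outsheet using "stickers.out", replace  // hypothetical file name
```

No intermediate files or merges are needed; the whole set of stickers stays in one dataset until the final reshape.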

Nick 
[email protected] 

Evelyn Ersanilli

For a survey I need to make stickers that can be used to randomly select respondents within a household.
These stickers should contain the numbers 1-20 in a random order (most households will have fewer than 20 members). Each sticker should have its own random order, though it is not a problem if some stickers happen to be identical. The stickers should be generated in such a way that every ordering of the numbers 1-20 is possible and equally likely.
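Ranking independent uniform draws within a block is one common way to meet the "equally likely" requirement. With 20 numbers there are 20! possible orders, far too many to tabulate, so the sketch below checks the idea on blocks of 3 instead, where the 6 possible orders should each turn up roughly one sixth of the time. The block size, the number of blocks, and the variable names `slot`, `rank` and `perm` are all assumptions made for illustration.

```stata
clear
set obs 60000                            // 20,000 blocks of 3 (illustrative sizes)
set seed 2803
gen long block = ceil(_n / 3)            // block identifier
gen byte slot = mod(_n - 1, 3) + 1       // original position within the block
gen double random = runiform()
bysort block (random) : gen byte rank = _n   // random rank within each block
keep block slot rank
reshape wide rank, i(block) j(slot)      // one row per block: rank1 rank2 rank3
egen perm = concat(rank1 rank2 rank3)    // e.g. "213" identifies one ordering
tabulate perm                            // six orderings, roughly equal frequencies expected
```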

I have written a do-file for this (see below), but it's pretty inefficient.
My first step was to make a data file with one variable that has the numbers 1-20 (so 20 cases). Then I let the do-file run to create 70 random lists of the numbers 1-20. 
The main inefficiency is that I had to create a separate .dta file for each sticker and then merge them. This is not so bad with 70 lists, but with the 9,600 that I will need to create it is pretty annoying.

The recent discussion about the need to set the random seed also got me a bit worried, but if I understood it correctly it shouldn't affect my problem too much.
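On the seed question, a small sketch of what setting the seed does in practice: resetting the same seed reproduces the same sequence of draws, so a do-file that sets the seed once at the top produces the same stickers every time it is rerun. The seed value and variable names here are arbitrary choices for illustration.

```stata
clear
set obs 20
set seed 2803
gen double a = runiform()   // first set of draws
set seed 2803               // reset to the same seed
gen double b = runiform()   // same sequence of draws again
assert a == b               // passes: the two variables are identical
```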

All suggestions are welcome.

Kind regards,

Evelyn

******************************
* generate 70 variables of uniform random numbers, u1-u70
forvalues i = 1/70 {
    gen u`i' = runiform()
}

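* for each random variable: sort on it, store the shuffled numbers, save them to their own file
foreach order of varlist u1-u70 {
    sort `order'
    gen order`order' = var1          // var1 holds the numbers 1-20
    preserve
    keep order`order'
    gen n = _n
    save "C:\Users\sticker source`order'.dta"
    restore
}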

* start from an empty dataset and merge the 70 files back together by row position
clear
gen n = _n
forvalues i = 1/70 {
    merge 1:1 _n using "C:\Users\sticker sourceu`i'.dta", noreport nogenerate
}
drop n
save "C:\Users\sticker pilot 1 marok.dta", replace
xpose, clear
outsheet using "C:\Users\sticker pilot 1 marok - transposed.out", replace
***************************************
