From: Clinton Thompson <clintonjthompson@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: repeatedly shuffle number sequence
Date: Tue, 25 Oct 2011 11:55:38 +0200

Point well-taken, Nick.

Many thanks,
Clint

On Tue, Oct 25, 2011 at 10:45 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> This sounds like the sort of problem in which you can spend more time
> working out the most efficient way to do it than actually doing it.
> You can answer your own question by timings with numbers of
> observations and variables close to what you will be using. My own
> instinct is to wonder about creating a long dataset with one variable
> divided into blocks and then finally doing a -reshape wide- but these
> days -sort-s are pretty fast in Stata unless your dataset is enormous.
>
> Nick
>
> On Tue, Oct 25, 2011 at 9:08 AM, Clinton Thompson
> <clintonjthompson@gmail.com> wrote:
>
>> I'm using Stata/SE 11.2 for Windows.
>>
>> This is a question that is part programming, part efficiency, and part
>> style. Consider a sequence of numbers, say [1,10], that I want to
>> shuffle/randomize several times such that I end up w/ k variables
>> where each of the variables created contains a random shuffling of the
>> values [1,10]. I approached this using a rather simple and
>> rudimentary -foreach- loop:
>>
>> >>>>>>>>>>>>> BEGIN >>>>>>>>>>>
>>
>> clear
>> set obs 10
>> set seed 20111025
>>
>> foreach num of numlist 1/5 {
>>     gen int seq`num' = _n
>>     gen rand`num' = runiform()
>>     sort rand`num'
>>     drop rand`num'
>> }
>>
>> <<<<<<<<<< END <<<<<<<<<<<<<
>>
>> This approach works -- in the sense that k variables are created where
>> each variable contains a random shuffling of the numbers from 1-10 --
>> but I'm not sure if this is the best way to approach this kind of
>> problem. Does the creation of a -wide- dataset (as in my approach)
>> make the most sense (I'll be expanding this to 20-25 variables instead
>> of the 5 currently programmed)? And I can easily change the sequences
>> of the values for all of the seq* variables depending on which of the
>> rand* variables is sorted but this doesn't seem too robust. Any
>> thoughts or advice on whether this is the best (read: correct and
>> most efficient?) approach to this problem is most appreciated.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
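
For comparison, here is a minimal sketch of the long-then-reshape idea Nick mentions. It is not code from the thread, only one way the suggestion might be written. The variable names block, pos and u and the locals n, k and N are assumptions chosen for illustration. The seed and the values 10 and 5 are taken from the original post.

```stata
clear
set seed 20111025

* Assumed illustration values: a sequence of length n, shuffled k times
local n = 10
local k = 5
local N = `n' * `k'

* Long layout: k blocks of n observations each
set obs `N'
gen int block = ceil(_n / `n')

* Within each block, seq runs from 1 to n
bysort block: gen int seq = _n

* One random draw per observation, then a single sort shuffles every block
gen double u = runiform()
sort block u

* pos is the row each shuffled value will occupy in the wide layout
by block: gen int pos = _n
drop u

* Wide layout: seq1 ... seqk, each a shuffle of 1 to n
reshape wide seq, i(pos) j(block)
```

This replaces the k sorts in the loop with one sort plus one -reshape-. Whether that is actually faster would need checking, as Nick suggests, for instance by wrapping each version in -timer on 1- and -timer off 1- and reading the result with -timer list 1- at the sizes you expect to use.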