Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: repeatedly shuffle number sequence
From: Clinton Thompson <[email protected]>
To: [email protected]
Subject: Re: st: repeatedly shuffle number sequence
Date: Tue, 25 Oct 2011 11:55:38 +0200
Point well-taken, Nick.
Many thanks,
Clint
On Tue, Oct 25, 2011 at 10:45 AM, Nick Cox <[email protected]> wrote:
> This sounds like the sort of problem in which you can spend more time
> working out the most efficient way to do it than actually doing it.
> You can answer your own question by timings with numbers of
> observations and variables close to what you will be using. My own
> instinct is to wonder about creating a long dataset with one variable
> divided into blocks and then finally doing a -reshape wide- but these
> days -sort-s are pretty fast in Stata unless your dataset is enormous.
>
> Nick
>
> On Tue, Oct 25, 2011 at 9:08 AM, Clinton Thompson
> <[email protected]> wrote:
>
>> I'm using Stata/SE 11.2 for Windows.
>>
>> This is a question that is part programming, part efficiency, and part
>> style. Consider a sequence of numbers, say [1,10], that I want to
>> shuffle/randomize several times such that I end up with k variables
>> where each of the variables created contains a random shuffling of the
>> values [1,10]. I approached this using a rather simple and
>> rudimentary -foreach- loop:
>>
>> >>>>>>>>>> BEGIN >>>>>>>>>>>
>>
>> clear
>> set obs 10
>> set seed 20111025
>>
>> foreach num of numlist 1/5 {
>>     gen int seq`num' = _n
>>     gen rand`num' = runiform()
>>     sort rand`num'
>>     drop rand`num'
>> }
>>
>> <<<<<<<<<< END <<<<<<<<<<<<<
>>
>> This approach works -- in the sense that k variables are created where
>> each variable contains a random shuffling of the numbers from 1-10 --
>> but I'm not sure if this is the best way to approach this kind of
>> problem. Does the creation of a -wide- dataset (as in my approach)
>> make the most sense (I'll be expanding this to 20-25 variables instead
>> of the 5 currently programmed)? And I can easily change the sequences
>> of the values for all of the seq* variables depending on which of the
>> rand* variables is sorted but this doesn't seem too robust. Any
>> thoughts or advice on whether this is the best (read: correct and
>> most efficient?) approach to this problem is most appreciated.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
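The long-then-reshape idea raised in the thread could look roughly like the sketch below. It is only one possible reading of that suggestion, not code posted by either author. The names block, pos, and u, and the locals n and k, are made up for illustration; the seed and the sizes (10 values, 5 variables) are taken from the original example.

```stata
* Sketch only: build one long variable in k blocks, shuffle within
* each block, then reshape wide. Names block, pos, u are hypothetical.
clear
set seed 20111025
local n 10                  // length of the sequence 1..n
local k 5                   // number of shuffled variables wanted
local N = `n' * `k'
set obs `N'

gen int block = ceil(_n / `n')            // block id 1..k
gen int seq   = _n - (block - 1) * `n'    // values 1..n within each block
gen double u  = runiform()                // random sort key

sort block u                 // one sort shuffles every block at once
by block: gen int pos = _n   // row position 1..n after shuffling
drop u

reshape wide seq, i(pos) j(block)         // yields seq1 ... seq`k'
```

This uses a single -sort- instead of one per variable, at the cost of a -reshape-. Whether that is actually faster for a given n and k is something to check by timing rather than assume.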
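The advice to settle the efficiency question by timing can be followed with Stata's -timer- commands. The sketch below times the loop from the original question at a larger size. The observation count (100,000) is an assumption chosen for illustration; 25 variables is the upper figure mentioned in the question. Note that -int- only holds values up to 32,740, so the sketch stores the sequence as -long-.

```stata
* Sketch only: time the wide, loop-based shuffle at an assumed size.
clear
set seed 20111025
local n 100000              // assumed number of observations
local k 25                  // number of shuffled variables
set obs `n'

timer clear 1
timer on 1
forvalues j = 1/`k' {
    gen long seq`j' = _n             // long, since n exceeds the int range
    gen double rand`j' = runiform()  // random sort key
    sort rand`j'                     // shuffles all rows, including earlier seq*
    drop rand`j'
}
timer off 1
timer list 1                // elapsed seconds are also left in r(t1)
```

Running the same timing around an alternative (for example a long layout followed by -reshape wide-) at the sizes you actually plan to use gives a direct comparison.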