Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: New method to avoid looping over each observation and across a variable list
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: New method to avoid looping over each observation and across a variable list
Date: Tue, 22 Jan 2013 20:46:25 +0000
Jeph's excellent advice aside, this wide structure is less convenient
for most data analysis with Stata than a long structure.
You are going to be writing these loops across variables again and again.
Nick
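
To illustrate the long structure mentioned above, here is a minimal sketch on hypothetical toy data in the same wide layout (three months only, values invented for illustration). The names yymm and event_value are assumptions, not names from the thread. Note that -reshape long var- with the -string- option treats every variable whose name starts with "var" as part of the stub, so this assumes helper variables such as varyymm_string and var_value have not yet been created.

```stata
* Hypothetical toy data in the same wide layout (three months only).
clear
input str5 id str4 eventyymm var0201 var0202 var0203
A 0203 0 0 1
B 0201 . 0 0
C 0202 1 3 3
end

* Wide to long: one row per person-month.
* yymm is an assumed name for the new string month identifier.
reshape long var, i(id) j(yymm) string

* No loop across variables is needed: compare each row's month
* with the event month, which is constant within person.
gen event_value = var if yymm == eventyymm

* Copy the event-month value to every row of the same person.
* Sorting puts nonmissing values first, so [1] picks it up if it exists.
bysort id (event_value): replace event_value = event_value[1]
```

In the long layout the lookup becomes a single comparison between two variables, which is why repeated loops across yymm variables are no longer required. If only one row per person is wanted, -keep if yymm == eventyymm- after the -reshape- would be an alternative.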
On Tue, Jan 22, 2013 at 8:30 PM, Jeph Herrin <[email protected]> wrote:
> Why are you looping over observations?
>
> gen var_value=.
>
> foreach V of varlist var0201 - var0312 {
> replace var_value = `V' if eventyymm==subinstr("`V'","var","",1)
> }
>
>
> should do it.
>
>
>
> On 1/22/2013 2:57 PM, Jeremy Page wrote:
>>
>> Hello Everybody,
>>
>> I would like some advice about how to change some code that currently
>> loops across a long list of variables for each observation in my data
>> set.
>>
>> My data set has one record per person and there are monthly
>> occurrences of variables that have a suffix of yymm (two digit year
>> and two digit month) to record monthly information. I also have a
>> string variable (eventyymm) which contains the year and month of an
>> event, also given as "yymm". I would like to produce a new variable
>> which gives the information in var0201-var0312 during the month of the
>> event (eventyymm).
>>
>> The example below contains an example data set and my current code.
>> The code produces the correct result but in my actual data set I have
>> millions of observations and about 15 years of yymm variables to loop
>> over. My current method will take an extremely long time to process.
>>
>> Best,
>> Jeremy
>>
>> ******begin example***********
>> clear all
>> input str5 id str4 eventyymm ///
>> var0201 var0202 var0203 ///
>> var0204 var0205 var0206 ///
>> var0207 var0208 var0209 ///
>> var0210 var0211 var0212 ///
>> var0301 var0302 var0303 ///
>> var0304 var0305 var0306 ///
>> var0307 var0308 var0309 ///
>> var0310 var0311 var0312
>> A 0203 0 0 0 0 0 0 1 1 1 1 1 1 ///
>> 1 1 . . . 0 0 0 0 2 2 2
>> B 0301 . . . . . 0 0 0 0 0 0 0 ///
>> 0 0 0 0 9 9 9 1 1 1 1 1
>> C 0210 1 1 1 1 1 1 1 1 1 1 1 1 ///
>> 1 1 1 3 3 3 3 3 3 3 3 3
>> D 0212 0 0 0 0 0 0 . . . . . . ///
>> . 9 0 0 0 1 1 1 1 1 1 1
>> E 0310 3 3 3 3 3 3 3 3 3 3 3 3 ///
>> 3 3 3 3 3 3 3 3 3 3 3 3
>> end
>>
>> ***generate variable to match with variable name
>> gen varyymm_string = "var" + eventyymm
>>
>> ***generate empty variable to populate in the loop
>> gen var_value = .
>>
>> ***loop over observations
>> foreach i of num 1(1)5 {
>> ***loop across variables
>> foreach x of varlist var0201 - var0312 {
>> replace var_value = `x' if varyymm_string == `"`x'"' & _n == `i'
>> } /* close loop across variables */
>> } /* close loop over observations */
>>
>> ******end example***********