From: Chris Yang <[email protected]>
To: [email protected]
Subject: Re: st: Re: Replace current values matching certain condition using values from other observations?
Date: Sat, 25 Jan 2014 12:52:53 -0500
Thank you Joseph.
As a related question: if I have many variables that need to be
replaced like this, is there a quick way to do them all at once
instead of one by one? I could loop over the variables, but if all
the variable names follow a pattern, e.g. var*, is there a shortcut?
For the example given, I tried

bysort group (seq): replace var*=var*[1] if seq > 1

but Stata returns the error "var ambiguous abbreviation".
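
For reference, the loop mentioned above can be sketched as follows. This is a minimal sketch rather than a confirmed answer from the thread: it assumes the error arises because replace expects a single variable name on the left-hand side, not a variable list, and it uses foreach ... of varlist to issue one replace per variable. The data are the six example observations quoted below, and the local macro name v is arbitrary.

```stata
* Sketch: apply the same within-group replacement to several variables.
* Assumes the variables group, seq, var1 and var2 from the example data.
clear
input byte(group seq var1 var2 var3)
1 1 3 2 3
1 2 1 1 2
1 3 2 2 4
2 1 3 2 1
2 2 3 3 3
3 1 3 2 1
end

* replace handles one variable at a time, so loop over the variable list.
* Within each group, sorted by seq, copy the first observation's value
* into every observation whose seq is greater than 1.
foreach v of varlist var1 var2 {
    bysort group (seq): replace `v' = `v'[1] if seq > 1
}

list, noobs sepby(group)
```

Writing "foreach v of varlist var*" instead would loop over every variable whose name starts with var. In this example that would also overwrite var3, which the original request leaves unchanged, so the explicit list is used here.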
On Sat, Jan 25, 2014 at 3:33 AM, Joseph Coveney <[email protected]> wrote:
> Chris Yang wrote:
>
> I have a dataset that resembles the following structure:
>
> group seq var1 var2 var3
> 1 1 3 2 3
> 1 2 1 1 2
> 1 3 2 2 4
> 2 1 3 2 1
> 2 2 3 3 3
> 3 1 3 2 1
> ...
>
> Now, within each group, for all the observations whose seq > 1, I want
> to replace the values for var1 and var2 with those of the observation
> whose seq == 1. For example, for group 1 above, after the
> replacements, it would look like this:
>
> group seq var1 var2 var3
> 1 1 3 2 3
> 1 2 3 2 2
> 1 3 3 2 4
>
> If there is no seq > 1 in a given group, then no replacement is needed.
>
> Intuitively, it seems that I need a looping structure to go through
> all the observations one by one, checking the seq variable at each
> step. If it is greater than 1, I would look up the values of var1
> and var2 from the observation *within the same group* whose seq == 1,
> and use them to update the current observation. The question is:
> how do I do such look-ups in a loop?
>
> As always, is there a better/more efficient way of doing it? Your
> thoughts and suggestions are appreciated.
>
> --------------------------------------------------------------------------------
>
> In Stata, you tend to avoid looping over data. You can often take advantage of
> the fact that its data operations are "vectorized".
>
> Joseph Coveney
>
> . input byte(group seq var1 var2 var3)
>
> group seq var1 var2 var3
> 1. 1 1 3 2 3
> 2. 1 2 1 1 2
> 3. 1 3 2 2 4
> 4. 2 1 3 2 1
> 5. 2 2 3 3 3
> 6. 3 1 3 2 1
> 7. end
>
> .
> . bysort group (seq): replace var1 = var1[1] if seq > 1
> (2 real changes made)
>
> . by group: replace var2 = var2[1] if seq > 1
> (2 real changes made)
>
> .
> . list, noobs sepby(group)
>
>   +----------------------------------+
>   | group   seq   var1   var2   var3 |
>   |----------------------------------|
>   |     1     1      3      2      3 |
>   |     1     2      3      2      2 |
>   |     1     3      3      2      4 |
>   |----------------------------------|
>   |     2     1      3      2      1 |
>   |     2     2      3      2      3 |
>   |----------------------------------|
>   |     3     1      3      2      1 |
>   +----------------------------------+
>
> .
> . exit
>
> end of do-file
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/