Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Summarizing properties of other group members meeting specified conditions
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Summarizing properties of other group members meeting specified conditions
Date: Tue, 28 Jun 2011 13:50:55 +0100
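I think the flip but also fair answer is that you got what you asked
for. The maximum in each case is over the other -org_id-. For -org_id-
4 the loop excludes every observation with -org_id- 4, including the one
with -cond_1 == 1- and -var_2- of 4, so the largest remaining value in
-project_id- 2 is 3. For -org_id- 6 that observation is not excluded,
hence the 4. You need to translate the Stata statements back to your
verbal specifications to check whether you are calculating what you
want. My warning still applies.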
Nick
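
The point about "the other -org_id-" can be checked on the example data posted in this thread. What follows is a minimal sketch, not a tested recommendation: part (a) reruns the loop as posted, and part (b) is an assumed alternative in which every observation with cond_1 == 1 in the project counts, including observations of the same org_id. The variable names maxall and maxwins_all are made up for illustration, and whether (b) matches the verbal specification is for the poster to decide.

```stata
* Rebuild the example data posted in this thread
clear
input project_id org_id cond_1 cond_2 cond_3 var_1 var_2
1 1 0 1 0 3 0
1 2 1 0 0 0 3
1 3 0 0 1 0 1
2 4 0 1 0 2 0
2 4 1 0 0 0 4
2 5 1 0 0 0 3
2 6 0 1 0 1 0
3 8 0 1 0 1 0
end

* (a) The loop as posted: for each org_id, the maximum is taken over
*     cond_1 == 1 observations of OTHER org_id values in the same project
gen maxwins = .
su org_id, meanonly
quietly forvalues i = 1/`r(max)' {
    gen include = 1 if org_id != `i' & cond_1 == 1
    egen x = max(var_2 * include), by(project_id)
    replace maxwins = x if org_id == `i' & cond_2 == 1
    drop include x
}

* (b) Assumed alternative: every cond_1 == 1 observation in the project
*     counts, including those of the same org_id, so no loop is needed.
*     maxall and maxwins_all are hypothetical names.
egen maxall = max(cond(cond_1 == 1, var_2, .)), by(project_id)
gen maxwins_all = maxall if cond_2 == 1

list project_id org_id cond_1 cond_2 var_2 maxwins maxwins_all, sepby(project_id)
```

In (a), org_id 4 in project_id 2 gets 3 because its own observation with var_2 equal to 4 is excluded; in (b) it gets 4. Note that under (b) an org_id satisfying both conditions would be credited with its own value, which may or may not be wanted.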
On Tue, Jun 28, 2011 at 12:27 PM, Erik Aadland <[email protected]> wrote:
> I modified the suggested code a little bit, moving the cond_2 == 1 to the replace line, and now it almost works as intended.
>
> The modified code is:
>
> sort project_id org_id
>
> gen maxwins = .
> su org_id, meanonly
> quietly forvalues i = 1/`r(max)' {
>     gen include = 1 if org_id != `i' & cond_1 == 1
>     egen x = max(var_2 * include), by(project_id)
>     replace maxwins = x if org_id == `i' & cond_2 == 1
>     drop include x
> }
>
> However, I do get a strange result when I have more than one org_id with cond_1 == 1 in the same project_id.
>
> Here is the resulting output:
>
> project_id  org_id  cond_1  cond_2  cond_3  var_1  var_2  maxwins
>          1       1       0       1       0      3      0        3
>          1       2       1       0       0      0      3        .
>          1       3       0       0       1      0      1        .
>          2       4       0       1       0      2      0        3
>          2       4       1       0       0      0      4        .
>          2       5       1       0       0      0      3        .
>          2       6       0       1       0      1      0        4
>          3       8       0       1       0      1      0        .
>
>
> Why does org_id 4 in project_id 2 get a maxwins value of 3 when the max var_2 value for org_id with cond_1 == 1 is 4?
> And why does org_id 6 in project_id 2 get the correct maxwins value of 4 when org_id 4 does not?
>
> Kind regards,
>
> Erik Aadland.
>
>
>
>
>
> ----------------------------------------
>> Date: Tue, 28 Jun 2011 11:32:53 +0100
>> Subject: Re: st: Summarizing properties of other group members meeting specified conditions
>> From: [email protected]
>> To: [email protected]
>>
>> I guess that your original typo remains in
>>
>> gen include 1 = if org_id != `i' & cond_1 == 1
>>
>> which should be more like
>>
>> gen include = 1 if org_id != `i' & cond_1 == 1
>>
>> Warning: I am not sure that I have grasped your verbal summary of what
>> you want. My emphasis is on showing you Stata techniques.
>>
>> Nick
>>
>> On Tue, Jun 28, 2011 at 11:16 AM, Erik Aadland <[email protected]> wrote:
>> > Thank you so much for your help and input, Nick. I sincerely appreciate it.
>> >
>> > When testing the suggested code, after having sorted on project_id and org_id, I get an error message for the following line of code:
>> >
>> > quietly forvalues i = 1/`r(max)' {
>> >
>> > The error message is:
>> >
>> > 1 invalid name
>> > r(198);
>> >
>> > How can I get around this problem?
>> >
>> >
>> > Kind regards,
>> >
>> > Erik Aadland.
>> >
>> >
>> >
>> >
>> >> Date: Tue, 28 Jun 2011 08:20:57 +0100
>> >> Subject: Re: st: Summarizing properties of other group members meeting specified conditions
>> >> From: [email protected]
>> >> To: [email protected]
>> >>
>> >> There was another typo in the line I corrected (you can't have two
>> >> -if-s) and it is not clear where your r(max) comes from. This may be
>> >> closer, but I've not tested it.
>> >>
>> >> gen maxwins = .
>> >> su org_id, meanonly
>> >> quietly forvalues i = 1/`r(max)' {
>> >> gen include 1 = if org_id != `i' & cond_1 == 1
>> >> egen x = max(var_2 * include) if cond_2 == 1, by(project_id)
>> >> replace maxwins = x if org_id == `i'
>> >> drop include x
>> >> }
>> >>
>> >>
>> >>
>> >> On Mon, Jun 27, 2011 at 3:26 PM, Erik Aadland <[email protected]> wrote:
>> >> > Dear statalist.
>> >> >
>> >> > After having read the following FAQ http://www.stata.com/support/faqs/data/members.html and experimented with solutions, I am still struggling.
>> >> >
>> >> > For each project_id, I am trying to generate a variable that, for each org_id meeting a specified condition, holds the highest observed value of a given variable among the other org_ids that meet a different specified condition. I think an example is in order:
>> >> >
>> >> > Example structure:
>> >> >
>> >> > project_id  org_id  cond_1  cond_2  cond_3  var_1  var_2
>> >> >          1       1       0       1       0      3      0
>> >> >          1       2       1       0       0      0      3
>> >> >          1       3       0       0       1      0      1
>> >> >          2       4       0       1       0      2      0
>> >> >          2       4       1       0       0      0      4
>> >> >          2       5       1       0       0      0      3
>> >> >          2       6       0       1       0      1      0
>> >> >          3       8       0       1       0      1      0
>> >> >
>> >> >
>> >> > For instance, for each project_id I am trying to generate a variable that, for each org_id meeting cond_2 == 1, holds the highest value of var_2 among the org_ids that meet cond_1 == 1. In the case of project_id 1 above, org_id 1 should get a score of 3, while org_id 2 and 3 should get a missing value. Some project_ids may have only one observation (consisting of a single org_id), as in project_id 3 above, or may not have other org_ids that meet the specified condition. In those cases it is of course not possible to generate a value for the variable.
>> >> >
>> >> > I have tried with the following code, but I cannot figure out how to specify cond_2 == 1, and it comes out all wrong (Stata does not even accept my code):
>> >> >
>> >> > sort project_id org_id
>> >> > gen maxwins = .
>> >> > quietly forvalues i = 1/`r(max)' {
>> >> > gen include 1 = if org_id != `i' if cond_1 == 1
>> >> > egen x = max(var_2 * include), by(project_id)
>> >> > replace maxwins = x if org_id == `i'
>> >> > drop include x
>> >> > }
>> >> >
>> >> > Is it possible to do this without loops also?
>> >>