Thanks Martin. For now I can manage with what I have.
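
For the archive, here is a minimal sketch of one loop-free way to get all four dummies, assuming the definitions (d) to (g) quoted below. It is not Martin's or Nick's code, and the n_* helper names are made up for illustration. The idea is to count, within each contract_id, how many observations share the firm, the nation, or both, and then derive each dummy from those counts.

```stata
clear
input byte(contract_id firm_id) str2 nation_id
1 2 "US"
1 2 "US"
4 3 "UK"
4 3 "US"
8 4 "US"
8 4 "UK"
8 3 "US"
9 5 "US"
9 4 "UK"
9 3 "US"
10 5 "US"
10 5 "US"
10 6 "NL"
10 7 "NL"
10 4 "UK"
10 4 "CH"
end

* counts within each contract (n_* names are illustrative only)
bysort contract_id: gen n_all = _N
bysort contract_id firm_id: gen n_firm = _N
bysort contract_id nation_id: gen n_nat = _N
bysort contract_id firm_id nation_id: gen n_both = _N

* same firm and same nation in at least one OTHER observation
gen byte group_d    = n_both > 1
* same firm, different nation
gen byte group_f    = n_firm > n_both
* different firm, same nation
gen byte nongroup_d = n_nat > n_both
* different firm and different nation, by inclusion-exclusion
gen byte nongroup_f = n_all - n_firm - n_nat + n_both > 0

sort contract_id firm_id nation_id
list contract_id firm_id nation_id group_d group_f nongroup_d nongroup_f, sepby(contract_id) noobs
```

Worked by hand on the 16 example rows, this matches the target table quoted below, including nongroup_f = 0 for the row with contract_id 8, firm_id 4, US. Because it only counts, it works the same whether nation_id is a string or a labelled numeric variable.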
On Wed, Nov 11, 2009 at 2:29 PM, Martin Weiss <[email protected]> wrote:
>
> <>
>
> I am sure some combination of -duplicates tag- and -egen, group()- can get
> you there, but I am _way_ over my time limit on this one task. So I hope
> someone else can provide you with an answer.
>
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of joe j
> Sent: Wednesday, November 11, 2009 13:20
> To: [email protected]
> Subject: Re: st: AW: forvalues & replace not working under two 'not equal
> to' conditions
>
> Thank you. Yes, the definition for nongroup_f should have been what I
> wrote today, and last night in response to Tim's mail.
> The final goal is:
> (a)          (b)      (c)        (d)      (e)      (f)         (g)
> contract_id  firm_id  nation_id  group_d  group_f  nongroup_d  nongroup_f
> 1            2        US         1        0        0           0
> 1            2        US         1        0        0           0
> 4            3        UK         0        1        0           0
> 4            3        US         0        1        0           0
> 8            3        US         0        0        1           1
> 8            4        UK         0        1        0           1
> 8            4        US         0        1        1           0
> 9            3        US         0        0        1           1
> 9            4        UK         0        0        0           1
> 9            5        US         0        0        1           1
> 10           4        CH         0        1        0           1
> 10           4        UK         0        1        0           1
> 10           5        US         1        0        0           1
> 10           5        US         1        0        0           1
> 10           6        NL         0        0        1           1
> 10           7        NL         0        0        1           1
>
> And, the correct definitions of the last four 'output' variables are:
>
> (d) group_d = 1 when both firm_id and nation_id are same for the
> given observation relative to at least one other observation with
> the same contract_id
> (e) group_f = 1 when firm_id is same but nation_id is different for the
> given observation relative to at least one other observation with
> the same contract_id
> (f) nongroup_d = 1 when firm_id is different but nation_id is same for the
> given observation relative to at least one other observation with
> the same contract_id
> (g) nongroup_f = 1 when both firm_id and nation_id are different for the
> given observation relative to at least one other observation with
> the same contract_id
>
> The first three variables could be derived following your logic, and
> for the last I'd see how to apply your suggestions (I'd also re-read
> Nick's paper).
>
> On Wed, Nov 11, 2009 at 12:40 PM, Martin Weiss <[email protected]> wrote:
>>
>> <>
>>
>>
>> Wait a minute! Seems to me you also changed the definition itself, which
>> triggers a different outcome for this last dummy? Anyway, provide your new
>> final goal, as you did yesterday, together with the correct definitions.
>>
>> I think you can safely omit the -forvalues- loops. Nick was not fond of them
>> yesterday, and neat solutions to such problems usually are derived from a
>> judicious combination of -bysort- and some -egen- function(s). This is
>> material covered comprehensively in Nick's seminal
>> http://www.stata-journal.com/sjpdf.html?articlenum=pr0004. Other commands
>> recently employed for insidious problems of this kind are -expand-,
>> -tempfile- and -merge-...
>>
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of joe j
>> Sent: Wednesday, November 11, 2009 12:06
>> To: [email protected]
>> Subject: Re: st: AW: forvalues & replace not working under two 'not equal
>> to' conditions
>>
>> Just an update. I discovered that, given the definition of nongroup_f
>> as "equals 1 when both firm_id and nation_id are different for the
>> given observation relative to at least one other observation within
>> the same contract_id", the following should be the correct output for
>> contract_id 8 (the columns being contract_id, firm_id, nation_id and
>> nongroup_f):
>>
>> 8 3 US 1
>> 8 4 UK 1
>> 8 4 US 0
>> Note that for firm_id 4 in the US, the value of nongroup_f should
>> be 0. (Indeed I had made a mistake in the output I posted yesterday).
>> While I will use Martin's excellent code for the other three columns
>> (group_d, etc), for the nongroup_f column alone, following Nick's
>> pointers, I found that adding to the IF clause "nation_id[_n+`i']!=."
>> in my clunky code would yield the correct result.
>>
>> forvalues i=1/`=_N'{
>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n-`i']) & ///
>>     (nation_id~=nation_id[_n-`i']) & (nation_id[_n-`i']!=.)
>> }
>> forvalues i=1/`=_N'{
>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n+`i']) & ///
>>     (nation_id~=nation_id[_n+`i']) & (nation_id[_n+`i']!=.)
>> }
>> (I know it doesn't make sense to use _N as the upper limit; I'd
>> perhaps use the number of records in the contract_id with the maximum
>> number of records. I'd also see if Martin's code could be used here as
>> well with modifications)
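>>
>> As far as I understand it, the extra condition matters because a
>> subscript that points outside the by-group returns missing, and an
>> observed value is never equal to missing, so both ~= tests come out
>> true at the edge of every group. A tiny sketch with made-up variables
>> g and x (not from my data):
>>
>> ```stata
>> clear
>> input byte(g x)
>> 1 5
>> 1 5
>> end
>> * x[_n-1] is missing for the first observation of each by-group
>> bysort g: gen byte differs = x != x[_n-1]
>> * differs is 1 in the first observation (5 != .) and 0 in the second
>> list
>> ```
>>
>> For a string variable the out-of-range value is the empty string, so
>> the guard there would be != "" rather than != .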
>>
>> Thanks again for all the help.
>>
>> On Wed, Nov 11, 2009 at 12:18 AM, joe j <[email protected]> wrote:
>>> Sorry, I should have explained it better. nongroup_f = 1 when both
>>> firm_id and nation_id are different for the given observation relative
>>> to "at least one other observation" within the same contract_id. Thus
>>> in the following case of contract_id=10, we have value 1 for all
>>> observations for the nongroup_f variable. Martin's last response gives
>>> the correct result. Thanks, joe.
>>>
>>> 10 4 CH 0 1 0 1
>>> 10 4 UK 0 1 0 1
>>> 10 5 US 1 0 0 1
>>> 10 5 US 1 0 0 1
>>> 10 6 NL 0 0 1 1
>>> 10 7 NL 0 0 1 1
>>>
>>> On Tue, Nov 10, 2009 at 11:54 PM, Tim Wade <[email protected]> wrote:
>>>> Maybe I am missing something obvious here, but I can't follow what you
>>>> are trying to do either. This criterion:
>>>>
>>>>> 4. nongroup_f = 1 when both firm_id and nation_id are different for
>>>>> two or more observations with the same contract id
>>>>
>>>> does not seem to be consistent with this line listing:
>>>>
>>>>> 10 5 US 1 0 0 1
>>>>> 10 5 US 1 0 0 1
>>>>
>>>> Here are two observations with the same firm_id and nation_id, yet
>>>> nongroup_f is 1. However, you may want to try looking at some
>>>> combinations of -duplicates tag- and -levelsof-; this might help as an
>>>> alternative approach.
>>>>
>>>> Tim
>>>>
>>>>
>>>> On Tue, Nov 10, 2009 at 12:08 PM, joe j <[email protected]> wrote:
>>>>> Thanks. The last 4 columns (group_d; group_f; nongroup_d; nongroup_f)
>>>>> are the final output variables. Their definitions are below the table.
>>>>>
>>>>> ******
>>>>> contract_id; firm_id; nation_id; group_d; group_f; nongroup_d; nongroup_f
>>>>> 1 2 US 1 0 0 0
>>>>> 1 2 US 1 0 0 0
>>>>> 4 3 UK 0 1 0 0
>>>>> 4 3 US 0 1 0 0
>>>>> 8 3 US 0 0 1 1
>>>>> 8 4 UK 0 1 0 1
>>>>> 8 4 US 0 1 1 1
>>>>> 9 3 US 0 0 1 1
>>>>> 9 4 UK 0 0 0 1
>>>>> 9 5 US 0 0 1 1
>>>>> 10 4 CH 0 1 0 1
>>>>> 10 4 UK 0 1 0 1
>>>>> 10 5 US 1 0 0 1
>>>>> 10 5 US 1 0 0 1
>>>>> 10 6 NL 0 0 1 1
>>>>> 10 7 NL 0 0 1 1
>>>>> ******
>>>>> 1. group_d = 1 when both firm_id and nation_id are same for two or
>>>>> more observations with the same contract id
>>>>> 2. group_f = 1 when firm_id is same but nation_id is different for
>>>>> two or more observations with the same contract id
>>>>> 3. nongroup_d = 1 when firm_id is different but nation_id is same for
>>>>> two or more observations with the same contract id
>>>>> 4. nongroup_f = 1 when both firm_id and nation_id are different for
>>>>> two or more observations with the same contract id
>>>>>
>>>>>
>>>>> On Tue, Nov 10, 2009 at 5:47 PM, Martin Weiss <[email protected]> wrote:
>>>>>>
>>>>>> <>
>>>>>>
>>>>>>
>>>>>> For clarification, you could provide the solution, i.e. the dummies that you
>>>>>> actually want to see as your final output, for your chosen example. Makes it
>>>>>> considerably easier to work towards code for you...
>>>>>>
>>>>>>
>>>>>>
>>>>>> HTH
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: [email protected]
>>>>>> [mailto:[email protected]] On Behalf Of joe j
>>>>>> Sent: Tuesday, November 10, 2009 17:39
>>>>>> To: [email protected]
>>>>>> Subject: Re: st: AW: forvalues & replace not working under two 'not equal
>>>>>> to' conditions
>>>>>>
>>>>>> Thanks Martin. I think I wasn't clear enough in the last mail. I was
>>>>>> not looking at various combinations of firm_id, nation_id and
>>>>>> contract_id 'for each observation'. Rather I was looking at the
>>>>>> similarity or difference of firm_id/nation_id 'between two or more
>>>>>> observations' under each contract_id.
>>>>>>
>>>>>> Based on Martin's suggestion I could derive group_d (see below). But I
>>>>>> still can't get right nongroup_f, which equals 1 (for all
>>>>>> observations) if firm_id and nation_id are different for two or more
>>>>>> observations under each contract_id (but it takes a value 1, wrongly,
>>>>>> for all observations in the data)
>>>>>>
>>>>>> *deriving group_d (this works)
>>>>>> egen groups=group(firm_id nation_id)
>>>>>>
>>>>>> bys contract_id (groups): /*
>>>>>> */ gen byte distinctcount_group_d= /*
>>>>>> */ (groups[_n]==groups[_n+1])
>>>>>>
>>>>>> bys contract_id (groups): /*
>>>>>> */ replace distinctcount_group_d=1 /*
>>>>>> */ if (groups[_n]==groups[_n-1])
>>>>>>
>>>>>> *2 deriving nongroup_f doesn't work (e.g. it should be 0 for contract_id=1)
>>>>>> bys contract_id (groups): /*
>>>>>> */ gen byte distinctcount_nongroup_f= /*
>>>>>> */ (groups[_n]~=groups[_n+1]) & (nation_id[_n]~=nation_id[_n+1])
>>>>>>
>>>>>> bys contract_id (groups): /*
>>>>>> */ replace distinctcount_nongroup_f=1 /*
>>>>>> */ if (groups[_n]~=groups[_n-1]) & (nation_id[_n]~=nation_id[_n-1])
>>>>>>
>>>>>> On Tue, Nov 10, 2009 at 4:14 PM, Martin Weiss <[email protected]> wrote:
>>>>>>>
>>>>>>> <>
>>>>>>>
>>>>>>> I think a variable denoting the combinations between the three ids is a good
>>>>>>> place to start for you:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *************
>>>>>>> clear*
>>>>>>> inp byte(contract_id firm_id) nation_id:mylabel, auto
>>>>>>> 1 2 "US"
>>>>>>> 1 2 "US"
>>>>>>> 4 3 "UK"
>>>>>>> 4 3 "US"
>>>>>>> 8 4 "US"
>>>>>>> 8 4 "UK"
>>>>>>> 8 3 "US"
>>>>>>> 9 5 "US"
>>>>>>> 9 4 "UK"
>>>>>>> 9 3 "US"
>>>>>>> 10 5 "US"
>>>>>>> 10 5 "US"
>>>>>>> 10 6 "NL"
>>>>>>> 10 7 "NL"
>>>>>>> 10 4 "UK"
>>>>>>> 10 4 "CH"
>>>>>>> end
>>>>>>>
>>>>>>> egen groups=group(contract_id firm_id nation_id)
>>>>>>>
>>>>>>> l, sepby(con) noobs
>>>>>>> *************
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> HTH
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: [email protected]
>>>>>>> [mailto:[email protected]] On Behalf Of joe j
>>>>>>> Sent: Tuesday, November 10, 2009 16:04
>>>>>>> To: [email protected]
>>>>>>> Subject: st: forvalues & replace not working under two 'not equal to'
>>>>>>> conditions
>>>>>>>
>>>>>>> My dataset has three variables: 1. contract_id, 2. firm_id and 3.
>>>>>>> nation_id. I want to create 4 variables, each of which gets a value of
>>>>>>> 1 if certain conditions are met. The variables I want to create are
>>>>>>> specific to the contract id, and are:
>>>>>>>
>>>>>>> 1. group_d = 1 when both firm_id and nation_id are same for two or
>>>>>>> more firms with the same contract id
>>>>>>> 2. group_f = 1 when firm_id is same but nation_id is different for
>>>>>>> two or more firms with the same contract id
>>>>>>> 3. nongroup_d = 1 when firm_id is different but nation_id is same for
>>>>>>> two or more firms with the same contract id
>>>>>>> 4. nongroup_f = 1 when both firm_id and nation_id are different for
>>>>>>> two or more firms with the same contract id
>>>>>>>
>>>>>>> The following code works well for the first three variables, but not
>>>>>>> for the last, nongroup_f; the value is 1 for all observations. I can't
>>>>>>> figure out why.
>>>>>>>
>>>>>>> This is a sample code:
>>>>>>>
>>>>>>> clear
>>>>>>> inp str10(contract_id firm_id nation_id)
>>>>>>> 1 2 "US"
>>>>>>> 1 2 "US"
>>>>>>> 4 3 "UK"
>>>>>>> 4 3 "US"
>>>>>>> 8 4 "US"
>>>>>>> 8 4 "UK"
>>>>>>> 8 3 "US"
>>>>>>> 9 5 "US"
>>>>>>> 9 4 "UK"
>>>>>>> 9 3 "US"
>>>>>>> 10 5 "US"
>>>>>>> 10 5 "US"
>>>>>>> 10 6 "NL"
>>>>>>> 10 7 "NL"
>>>>>>> 10 4 "UK"
>>>>>>> 10 4 "CH"
>>>>>>> end
>>>>>>>
>>>>>>>
>>>>>>> *1.group_d . WORKS!
>>>>>>> gen group_d=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_d=1 if firm_id==firm_id[_n-`i'] & ///
>>>>>>>     nation_id==nation_id[_n-`i']
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_d=1 if firm_id==firm_id[_n+`i'] & ///
>>>>>>>     nation_id==nation_id[_n+`i']
>>>>>>> }
>>>>>>>
>>>>>>> *2.group_f WORKS!
>>>>>>> gen group_f=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_f=1 if firm_id==firm_id[_n-`i'] & ///
>>>>>>>     nation_id!=nation_id[_n-`i']
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_f=1 if firm_id==firm_id[_n+`i'] & ///
>>>>>>>     nation_id!=nation_id[_n+`i']
>>>>>>> }
>>>>>>>
>>>>>>> *3. nongroup_d WORKS!
>>>>>>> gen nongroup_d=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_d=1 if firm_id!=firm_id[_n-`i'] & ///
>>>>>>>     nation_id==nation_id[_n-`i']
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_d=1 if firm_id!=firm_id[_n+`i'] & ///
>>>>>>>     nation_id==nation_id[_n+`i']
>>>>>>> }
>>>>>>>
>>>>>>> *4.nongroup_f DOESN'T WORK!!
>>>>>>> gen nongroup_f=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n-`i']) & ///
>>>>>>>     (nation_id~=nation_id[_n-`i'])
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n+`i']) & ///
>>>>>>>     (nation_id~=nation_id[_n+`i'])
>>>>>>> }
>>>>>>> *
>>>>>>> * For searches and help try:
>>>>>>> * http://www.stata.com/help.cgi?search
>>>>>>> * http://www.stata.com/support/statalist/faq
>>>>>>> * http://www.ats.ucla.edu/stat/stata/
>>>>>>>