Statalist



AW: st: AW: forvalues & replace not working under two 'not equal to' conditions


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   AW: st: AW: forvalues & replace not working under two 'not equal to' conditions
Date   Wed, 11 Nov 2009 14:29:30 +0100


I am sure some combination of -duplicates tag- and -egen, group()- can get
you there, but I am _way_ over my time limit on this one task. So I hope
someone else can provide you with an answer.



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of joe j
Sent: Wednesday, 11 November 2009 13:20
To: [email protected]
Subject: Re: st: AW: forvalues & replace not working under two 'not equal
to' conditions

Thank you. Yes, the definition for nongroup_f should have been what I
wrote today, and last night in response to Tim's mail.
The final goal is:
(a) contract_id  (b) firm_id  (c) nation_id  (d) group_d  (e) group_f  (f) nongroup_d  (g) nongroup_f
1                2            US             1            0            0               0
1                2            US             1            0            0               0
4                3            UK             0            1            0               0
4                3            US             0            1            0               0
8                3            US             0            0            1               1
8                4            UK             0            1            0               1
8                4            US             0            1            1               0
9                3            US             0            0            1               1
9                4            UK             0            0            0               1
9                5            US             0            0            1               1
10               4            CH             0            1            0               1
10               4            UK             0            1            0               1
10               5            US             1            0            0               1
10               5            US             1            0            0               1
10               6            NL             0            0            1               1
10               7            NL             0            0            1               1

And, the correct definitions of the last four 'output' variables are:

(d) group_d = 1 when both firm_id and nation_id are the same for the
given observation as for at least one other observation with the same
contract_id
(e) group_f = 1 when firm_id is the same but nation_id is different for
the given observation compared with at least one other observation with
the same contract_id
(f) nongroup_d = 1 when firm_id is different but nation_id is the same
for the given observation compared with at least one other observation
with the same contract_id
(g) nongroup_f = 1 when both firm_id and nation_id are different for the
given observation compared with at least one other observation with the
same contract_id
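
The four definitions above can be sketched without any loops by counting observations within nested -bysort- groups and comparing the counts. This is only an illustrative sketch, not code posted in this thread: the helper names n_c, n_cf, n_cn and n_cfn are hypothetical, and it assumes firm_id and nation_id have no missing values. The last dummy uses inclusion-exclusion: the number of observations in the contract that differ on both firm and nation is the contract total, minus those sharing the firm, minus those sharing the nation, plus those sharing both.

```stata
clear
input byte(contract_id firm_id) str2 nation_id
1  2 "US"
1  2 "US"
4  3 "UK"
4  3 "US"
8  4 "US"
8  4 "UK"
8  3 "US"
9  5 "US"
9  4 "UK"
9  3 "US"
10 5 "US"
10 5 "US"
10 6 "NL"
10 7 "NL"
10 4 "UK"
10 4 "CH"
end

* Group sizes at each level of nesting (hypothetical helper variables)
bysort contract_id:                   gen long n_c   = _N
bysort contract_id nation_id:         gen long n_cn  = _N
bysort contract_id firm_id:           gen long n_cf  = _N
bysort contract_id firm_id nation_id: gen long n_cfn = _N

* (d) another observation shares both firm and nation
gen byte group_d    = (n_cfn > 1)
* (e) another observation shares the firm but not the nation
gen byte group_f    = (n_cf > n_cfn)
* (f) another observation shares the nation but not the firm
gen byte nongroup_d = (n_cn > n_cfn)
* (g) another observation differs on both (inclusion-exclusion)
gen byte nongroup_f = ((n_c - n_cf - n_cn + n_cfn) > 0)

drop n_c n_cf n_cn n_cfn
list, sepby(contract_id) noobs
```

Because each observation counts itself in every group size, the comparisons n_cf > n_cfn and n_cn > n_cfn only turn on when some other observation is involved, which matches the "at least one other observation" wording.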

The first three variables could be derived following your logic, and
for the last I'd see how to apply your suggestions (I'd also re-read
Nick's paper).
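
As for why the unguarded -forvalues- loop for nongroup_f quoted further down sets every observation to 1: under -by-, a subscript such as firm_id[_n-`i'] that points outside the current group evaluates to missing (. for numeric variables, "" for strings), and a non-missing value always differs from missing. The following is a minimal sketch of that behaviour on a single two-row contract; the variable names flagged_raw and flagged_guarded are hypothetical.

```stata
clear
input byte(contract_id firm_id) str2 nation_id
1 2 "US"
1 2 "US"
end

* Unguarded comparison with the previous row: in the first row of the group,
* firm_id[_n-1] and nation_id[_n-1] are missing, so both "!=" tests are true.
bysort contract_id: gen byte flagged_raw = ///
    (firm_id != firm_id[_n-1]) & (nation_id != nation_id[_n-1])

* Guarded version: skip comparisons against a row that does not exist.
bysort contract_id: gen byte flagged_guarded = ///
    (firm_id != firm_id[_n-1]) & (nation_id != nation_id[_n-1]) & ///
    !missing(firm_id[_n-1])

list, noobs
```

Here flagged_raw is 1 in the first row even though the two rows are identical, while flagged_guarded is 0 in both rows. The missing() function works for both numeric and string variables, so the guard does not depend on how nation_id is stored.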

On Wed, Nov 11, 2009 at 12:40 PM, Martin Weiss <[email protected]> wrote:
>
>
> Wait a minute! It seems to me you also changed the definition itself, which
> triggers a different outcome for this last dummy? Anyway, provide your new
> final goal, as you did yesterday, together with the correct definitions.
>
> I think you can safely omit the -forvalues- loops. Nick was not fond of them
> yesterday, and neat solutions to such problems usually are derived from a
> judicious combination of -bysort- and some -egen- function(s). This is
> material covered comprehensively in Nick's seminal
> http://www.stata-journal.com/sjpdf.html?articlenum=pr0004. Other commands
> recently employed for insidious problems of this kind are -expand-,
> -tempfile- and -merge-...
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of joe j
> Sent: Wednesday, 11 November 2009 12:06
> To: [email protected]
> Subject: Re: st: AW: forvalues & replace not working under two 'not equal
> to' conditions
>
> Just an update. I discovered that given the definition of nongroup_f
> "as equals  1  when both firm_id and nation_id are different for the
> given observation relative to at least one other observation within
> the same contract_id", the following should be the correct output for
> contract_id 8 (the columns being contract_id, firm_id, country_id and
> nongroup_f):
>
> 8       3       US      1
> 8       4       UK      1
> 8       4       US      0
> Note that for firm_id 4 in the US, the value of nongroup_f should
> be 0. (Indeed I had made a mistake in the output I posted yesterday).
> While I will use Martin's excellent code for the other three columns
> (group_d, etc), for the nongroup_f column alone, following Nick's
> pointers, I found that adding to the IF clause "nation_id[_n+`i']!=."
> in my clunky code would yield the correct result.
>
> forvalues i=1/`=_N'{
> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n-`i']) & ///
> (nation_id~=nation_id[_n-`i']) & (nation_id[_n-`i']!=.)
> }
> forvalues i=1/`=_N'{
> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n+`i']) & ///
> (nation_id~=nation_id[_n+`i']) & (nation_id[_n+`i']!=.)
> }
> (I know it doesn't make sense to use _N as the upper limit; I'd
> perhaps use the number of records in the contract_id with the maximum
> number of records. I'd also see if Martin's code could be used here as
> well with modifications)
>
> Thanks again for all the help.
>
> On Wed, Nov 11, 2009 at 12:18 AM, joe j <[email protected]> wrote:
>> Sorry, I should have explained it better. nongroup_f = 1  when both
>> firm_id and nation_id are different for the given observation relative
>> to "at least one other observation" within the same contract_id. Thus
>> in the following case of contract_id=10, we have value 1 for all
>> observations for the nongroup_f variable. Martin's last response gives
>> the correct result. Thanks, joe.
>>
>> 10      4       CH      0       1       0       1
>> 10      4       UK      0       1       0       1
>> 10      5       US      1       0       0       1
>> 10      5       US      1       0       0       1
>> 10      6       NL      0       0       1       1
>> 10      7       NL      0       0       1       1
>>
>> On Tue, Nov 10, 2009 at 11:54 PM, Tim Wade <[email protected]> wrote:
>>> Maybe I am missing something obvious here, but I can't follow what you
>>> are trying to do either. This criterion:
>>>
>>>> 4. nongroup_f = 1  when both firm_id and nation_id are different for
>>>> two or more observations with the same contract id
>>>
>>> does not seem to be consistent with this line listing:
>>>
>>>> 10      5       US      1       0       0       1
>>>> 10      5       US      1       0       0       1
>>>
>>> here are two observations with the same firm_id and nation_id yet
>>> nongroup_f is 1. However, you may want to try looking at some
>>> combinations of -duplicates, tag- and levelsof, this might help as an
>>> alternative approach.
>>>
>>> Tim
>>>
>>>
>>> On Tue, Nov 10, 2009 at 12:08 PM, joe j <[email protected]> wrote:
>>>> Thanks. The last 4 columns (group_d; group_f; nongroup_d; nongroup_f)
>>>> are the final output variables. Their definitions are below the table.
>>>>
>>>> ******
>>>> contract_id; firm_id; nation_id; group_d; group_f; nongroup_d; nongroup_f
>>>> 1       2       US      1       0       0       0
>>>> 1       2       US      1       0       0       0
>>>> 4       3       UK      0       1       0       0
>>>> 4       3       US      0       1       0       0
>>>> 8       3       US      0       0       1       1
>>>> 8       4       UK      0       1       0       1
>>>> 8       4       US      0       1       1       1
>>>> 9       3       US      0       0       1       1
>>>> 9       4       UK      0       0       0       1
>>>> 9       5       US      0       0       1       1
>>>> 10      4       CH      0       1       0       1
>>>> 10      4       UK      0       1       0       1
>>>> 10      5       US      1       0       0       1
>>>> 10      5       US      1       0       0       1
>>>> 10      6       NL      0       0       1       1
>>>> 10      7       NL      0       0       1       1
>>>> ******
>>>> 1. group_d = 1 when both firm_id and nation_id are same for two or
>>>> more observations with the same contract id
>>>> 2. group_f = 1  when firm_id is same but nation_id is different for
>>>> two or more observations with the same contract id
>>>> 3. nongroup_d = 1  when firm_id is different but nation_id is same for
>>>> two or more observations with the same contract id
>>>> 4. nongroup_f = 1  when both firm_id and nation_id are different for
>>>> two or more observations with the same contract id
>>>>
>>>>
>>>> On Tue, Nov 10, 2009 at 5:47 PM, Martin Weiss <[email protected]> wrote:
>>>>>
>>>>>
>>>>> For clarification, you could provide the solution, i.e. the dummies that you
>>>>> actually want to see as your final output, for your chosen example. Makes it
>>>>> considerably easier to work towards code for you...
>>>>>
>>>>>
>>>>>
>>>>> HTH
>>>>> Martin
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: [email protected]
>>>>> [mailto:[email protected]] On Behalf Of joe j
>>>>> Sent: Tuesday, 10 November 2009 17:39
>>>>> To: [email protected]
>>>>> Subject: Re: st: AW: forvalues & replace not working under two 'not equal
>>>>> to' conditions
>>>>>
>>>>> Thanks Martin. I think I wasn't clear enough in the last mail. I was
>>>>> not looking at various combinations of firm_id, nation_id and
>>>>> contract_id 'for each observation'. Rather I was looking at the
>>>>> similarity or difference of firm_id/nation_id 'between two or more
>>>>> observations' under each contract_id.
>>>>>
>>>>> Based on Martin's suggestion I could derive group_d (see below). But I
>>>>> still can't get right nongroup_f, which equals 1 (for all
>>>>> observations) if firm_id and nation_id are different for two or more
>>>>> observations under each contract_id (but it takes a value 1, wrongly,
>>>>> for all observations in the data)
>>>>>
>>>>> *deriving group_d (this works)
>>>>> egen groups=group(firm_id nation_id)
>>>>>
>>>>> bys contract_id (groups):  /*
>>>>> */ gen byte distinctcount_group_d= /*
>>>>> */ (groups[_n]==groups[_n+1])
>>>>>
>>>>> bys contract_id (groups):  /*
>>>>> */ replace distinctcount_group_d=1 /*
>>>>> */ if (groups[_n]==groups[_n-1])
>>>>>
>>>>> *2 deriving nongroup_f doesn't work (e.g. it should be 0 for contract_id=1)
>>>>> bys contract_id (groups):  /*
>>>>> */ gen byte distinctcount_nongroup_f= /*
>>>>> */ (groups[_n]~=groups[_n+1]) & (nation_id[_n]~=nation_id[_n+1])
>>>>>
>>>>> bys contract_id (groups):  /*
>>>>> */ replace distinctcount_nongroup_f=1 /*
>>>>> */ if (groups[_n]~=groups[_n-1]) & (nation_id[_n]~=nation_id[_n-1])
>>>>>
>>>>> On Tue, Nov 10, 2009 at 4:14 PM, Martin Weiss <[email protected]> wrote:
>>>>>>
>>>>>> I think a variable denoting the combinations between the three ids is a good
>>>>>> place to start for you:
>>>>>>
>>>>>>
>>>>>>
>>>>>> *************
>>>>>> clear*
>>>>>> inp byte(contract_id firm_id) nation_id:mylabel, auto
>>>>>> 1   2   "US"
>>>>>> 1   2   "US"
>>>>>> 4   3   "UK"
>>>>>> 4   3   "US"
>>>>>> 8   4   "US"
>>>>>> 8   4   "UK"
>>>>>> 8   3   "US"
>>>>>> 9   5   "US"
>>>>>> 9   4   "UK"
>>>>>> 9   3   "US"
>>>>>> 10   5   "US"
>>>>>> 10   5   "US"
>>>>>> 10   6   "NL"
>>>>>> 10   7   "NL"
>>>>>> 10   4   "UK"
>>>>>> 10   4   "CH"
>>>>>> end
>>>>>>
>>>>>> egen groups=group(contract_id firm_id nation_id)
>>>>>>
>>>>>> l, sepby(con) noobs
>>>>>> *************
>>>>>>
>>>>>>
>>>>>>
>>>>>> HTH
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: [email protected]
>>>>>> [mailto:[email protected]] On Behalf Of joe j
>>>>>> Sent: Tuesday, 10 November 2009 16:04
>>>>>> To: [email protected]
>>>>>> Subject: st: forvalues & replace not working under two 'not equal to'
>>>>>> conditions
>>>>>>
>>>>>> My dataset has three variables: 1. contract_id, 2. firm_id and 3.
>>>>>> nation_id. I want to create 4 variables, each of which gets a value of
>>>>>> 1 if certain conditions are met. The variables I want to create are
>>>>>> specific to the contract id, and are:
>>>>>>
>>>>>> 1. group_d = 1 when both firm_id and nation_id are the same for two or
>>>>>> more firms with the same contract id
>>>>>> 2. group_f = 1 when firm_id is the same but nation_id is different for
>>>>>> two or more firms with the same contract id
>>>>>> 3. nongroup_d = 1 when firm_id is different but nation_id is the same for
>>>>>> two or more firms with the same contract id
>>>>>> 4. nongroup_f = 1 when both firm_id and nation_id are different for
>>>>>> two or more firms with the same contract id
>>>>>>
>>>>>> The following code works well for the first three variables, but not
>>>>>> for the last, nongroup_f; the value is 1 for all observations. I can't
>>>>>> figure out why.
>>>>>>
>>>>>> This is a sample code:
>>>>>>
>>>>>> clear
>>>>>> inp str10(contract_id firm_id   nation_id)
>>>>>> 1   2   "US"
>>>>>> 1   2   "US"
>>>>>> 4   3   "UK"
>>>>>> 4   3   "US"
>>>>>> 8   4   "US"
>>>>>> 8   4   "UK"
>>>>>> 8   3   "US"
>>>>>> 9   5   "US"
>>>>>> 9   4   "UK"
>>>>>> 9   3   "US"
>>>>>> 10   5   "US"
>>>>>> 10   5   "US"
>>>>>> 10   6   "NL"
>>>>>> 10   7   "NL"
>>>>>> 10   4   "UK"
>>>>>> 10   4   "CH"
>>>>>> end
>>>>>>
>>>>>>
>>>>>> *1. group_d  WORKS!
>>>>>> gen group_d=.
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace group_d=1 if firm_id==firm_id[_n-`i'] & ///
>>>>>> nation_id==nation_id[_n-`i']
>>>>>> }
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace group_d=1 if firm_id==firm_id[_n+`i'] & ///
>>>>>> nation_id==nation_id[_n+`i']
>>>>>> }
>>>>>>
>>>>>> *2. group_f  WORKS!
>>>>>> gen group_f=.
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace group_f=1 if firm_id==firm_id[_n-`i'] & ///
>>>>>> nation_id!=nation_id[_n-`i']
>>>>>> }
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace group_f=1 if firm_id==firm_id[_n+`i'] & ///
>>>>>> nation_id!=nation_id[_n+`i']
>>>>>> }
>>>>>>
>>>>>> *3. nongroup_d  WORKS!
>>>>>> gen nongroup_d=.
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace nongroup_d=1 if firm_id!=firm_id[_n-`i'] & ///
>>>>>> nation_id==nation_id[_n-`i']
>>>>>> }
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace nongroup_d=1 if firm_id!=firm_id[_n+`i'] & ///
>>>>>> nation_id==nation_id[_n+`i']
>>>>>> }
>>>>>>
>>>>>> *4. nongroup_f  DOESN'T WORK!!
>>>>>> gen nongroup_f=.
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n-`i']) & ///
>>>>>> (nation_id~=nation_id[_n-`i'])
>>>>>> }
>>>>>> forvalues i=1/`=_N'{
>>>>>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n+`i']) & ///
>>>>>> (nation_id~=nation_id[_n+`i'])
>>>>>> }
>>>>>> *
>>>>>> *   For searches and help try:
>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>> *   http://www.stata.com/support/statalist/faq
>>>>>> *   http://www.ats.ucla.edu/stat/stata/