Re: st: Logical condition
Dmitriy Krichevskiy <[email protected]> writes,
> I have a large panel dataset for which I was hoping to analyze a
> particular subset of agents. More specifically:
>
> id time b1 i1
> 1 1 1 3
> 1 2 1 4
> 1 3 0 2
> 2 1 0 5
> 2 2 1 6
> 2 3 0 4
> 3 1 0 2
> 3 2 1 3
> 3 3 1 1
>
>
> I want to select only those agents from above example who have
> switched their b1 status from 0 to 1 in the first two periods (agents
> 2 and 3 above).
One solution is,
1. Make a variable that marks obs. for which time==1 & b1==0.
The variable is equal to 1 if the statement is true, 0 if false.
Call this variable cond1.
2. Make a variable that marks obs. for which time==2 & b1==1.
Call this variable cond2.
3. For each id, make cond1=1 in all obs. if it is true in any obs.
4. For each id, make cond2=1 in all obs. if it is true in any obs.
5. Keep observations for which cond1 & cond2 are true.
In Stata code, with comments keyed to the steps above,
gen cond1 = (time==1 & b1==0) // (1)
gen cond2 = (time==2 & b1==1) // (2)
sort id // (3)
by id: replace cond1 = sum(cond1)
by id: replace cond1 = cond1[_N]
by id: replace cond2 = sum(cond2) // (4)
by id: replace cond2 = cond2[_N]
keep if cond1 & cond2 // (5)
Here's more concise, equivalent code,
sort id
by id: egen cond1 = max(time==1 & b1==0)
by id: egen cond2 = max(time==2 & b1==1)
keep if cond1 & cond2
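To check the concise version against the example data in the question, here is a minimal, self-contained sketch. The -input- block just retypes the nine observations quoted above, and -bysort- is used in place of a separate -sort-; nothing else is assumed.

```stata
clear
* Example data copied from the question
input id time b1 i1
1 1 1 3
1 2 1 4
1 3 0 2
2 1 0 5
2 2 1 6
2 3 0 4
3 1 0 2
3 2 1 3
3 3 1 1
end

* Within each id, max() of a 0/1 expression is 1 if the
* expression is true in at least one observation
bysort id: egen cond1 = max(time==1 & b1==0)
bysort id: egen cond2 = max(time==2 & b1==1)

* Keep every observation of the ids that satisfy both conditions
keep if cond1 & cond2
list id time b1 i1, sepby(id)
```

Agent 1 has b1==1 at time 1, so cond1 is 0 for it and all three of its observations are dropped; the six observations for agents 2 and 3 remain.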
I often use the method above. The generic problem is
1. You have long data.
2. You want to choose all the observations for an id
for which a complicated condition is true.
The obvious solution is to switch the data to wide form, but it is often
easier to create the separate logical variables at the
detailed level and then convert them to 1 everywhere within id
if they are 1 anywhere.
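For comparison, the wide-form route might look roughly like the sketch below. It assumes the dataset holds only id, time, b1, and i1 (any other time-varying variable would also have to be listed in -reshape-), so that -reshape wide- creates b11, b12, b13 and i11, i12, i13.

```stata
clear
* Example data copied from the question
input id time b1 i1
1 1 1 3
1 2 1 4
1 3 0 2
2 1 0 5
2 2 1 6
2 3 0 4
3 1 0 2
3 2 1 3
3 3 1 1
end

* One row per id; b1 becomes b11 b12 b13, i1 becomes i11 i12 i13
reshape wide b1 i1, i(id) j(time)

* The condition is now a simple comparison across columns
keep if b11==0 & b12==1

* Return to long form using the settings reshape remembered
reshape long
```

With many time-varying variables or many periods the two reshapes become slow and awkward on a large panel, which is one reason to prefer the logical-variable approach.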
-- Bill
[email protected]