RE: st: AW: If-condition
From
"Nick Cox" <[email protected]>
To
<[email protected]>
Subject
RE: st: AW: If-condition
Date
Mon, 5 Jul 2010 11:40:13 +0100
Another way to approach this is as follows. It is in the same spirit as
Eric's code but a bit simpler.
1. We want to clear out of the way observations with problematic missing
values. In this case
gen OK = !missing(var1, var2)
creates -OK-, which is 1 if -var1- and -var2- are both nonmissing and 0
otherwise.
2. Now we want to select the last observation in each panel which is OK.
bysort OK idcode (year) : replace OK = OK & _n == _N
3. Then we proceed with our analyses -if OK-.
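A minimal sketch of the three steps above, assuming the -union- example data used elsewhere in this thread and two made-up variables -var1- and -var2- with artificial missing values (the seed and the 0.3 cutoff are arbitrary choices for illustration):

```stata
* Sketch only: the data setup (webuse union, made-up var1/var2) is an
* assumption for illustration, not part of the original question.
webuse union, clear
set seed 12345
generate var1 = runiform() if runiform() > 0.3
generate var2 = runiform() if runiform() > 0.3

* Step 1: flag observations where both variables are nonmissing
generate OK = !missing(var1, var2)

* Step 2: within each panel, keep the flag only on the last OK observation
bysort OK idcode (year) : replace OK = OK & _n == _N

* Check: at most one flagged observation per panel, and it is never missing
bysort idcode : egen nOK = total(OK)
assert nOK <= 1
assert !missing(var1, var2) if OK

* Step 3: use the flag to restrict whatever comes next
list idcode year var1 var2 if OK & idcode <= 5, clean
```

Putting -OK- first in the -bysort- list keeps the usable and unusable observations of each panel in separate groups, so _n == _N picks the last usable year rather than the last year overall.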
Nick
[email protected]
Eric Booth
Mareike wrote:
> by idcode: g insample = 1 if var1!=. & var2!=. & _n ==_N
...
> But if for a certain country the latest observation available is for
> example 2005, 'insample' doesn't display a "1" but still a missing
> value.
Sounds like Stata is doing exactly what you ask of it. Your code tells
Stata to mark insample==1 within each idcode only if all the conditions
are true:
var1 and var2 are nonmissing AND the observation is the last one in the
panel (not the last NONMISSING observation in the panel, which is what
you really want).
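A toy illustration of this point, using four made-up rows (the data are an assumption, not Mareike's): panel 1 has a usable observation in 2005 but a missing var1 in its last year, so the one-line condition never fires for it.

```stata
* Made-up data for illustration only
clear
input idcode year var1 var2
1 2005 3 4
1 2008 . 7
2 2005 1 2
2 2008 5 6
end

sort idcode year
by idcode: generate insample = 1 if var1 != . & var2 != . & _n == _N
list, sepby(idcode)

* Panel 1 gets no flag at all: 2008 is last but incomplete,
* and 2005 is complete but not last.
assert missing(insample) if idcode == 1
* Panel 2 is flagged in 2008, its last and complete observation.
assert insample == 1 if idcode == 2 & year == 2008
```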
I'm not sure how to do this in a single line of code, but expanding my
previous example, here's one way to do it:
************************
webuse union, clear
//setup the example data//
keep idcode year age black union south
replace year = 1900+year
**
ds year idcode, not
foreach x in `r(varlist)' {
bys idcode: egen ln_mean_`x' = mean(`x')
drop `x'
}
//new: create missing var1 var2 var//
forval n = 1/2 {
g var`n' = runiform()*1000
**create some missing data in var**
qui su var`n', d
replace var`n' = . if var`n'<r(p50)
}
//create "insample" indicator for last obs in panel//
sort idcode year
by idcode: g insample = 1 if _n ==_N
//new: fix insample where var1 and var2 are missing to be last NONMISSING observ//
by idcode: g i = 1 if mi(var1) & mi(var2)
by idcode: replace insample = 1 if i[_n+1]==1==insample[_n+1]
by idcode: replace insample = . if mi(var1) & mi(var2)
drop i
**check this**
li if insample==1 & mi(var1) & mi(var2), clean
ta year insample, miss
//use "insample" to select the last observation, despite the year//
regress ln_mean_age ln_mean_south ln_mean_union ln_mean_black if insample==1
************************
On Jul 3, 2010, at 3:52 AM, Mareike wrote:
> Thanks a lot for your answer. In general, I understand the logic
> behind your code and I think it might work for me.
> I actually just realized that I need to make the condition a bit more
> complex: I need to tell Stata to only work with a specific subsample
> of the data, in which two specific variables take non-missing values,
> and then to take the latest observation that is available for a
> certain country in this restricted sample.
> I tried to change your code a bit to do so, but it didn't have the
> expected result
> ...
> by idcode: g insample = 1 if var1!=. & var2!=. & _n ==_N
> ...
>
> Insample correctly takes the value "1" in 2008 for those countries
> that are part of the subsample and for which there exists an
> observation for the year 2008. But if for a certain country the latest
> observation available is for example 2005, 'insample' doesn't display
> a "1" but still a missing value.
Eric Booth
> Here's one way to do it:
> ************************
> webuse union, clear
>
> //setup the example data//
> keep idcode year age black union south
> replace year = 1900+year
> **
> ds year idcode, not
> foreach x in `r(varlist)' {
> bys idcode: egen ln_mean_`x' = mean(`x')
> drop `x'
> }
>
>
> //create "insample" indicator for last obs in panel//
> sort idcode year
> by idcode: g insample = 1 if _n ==_N
> ta year insample, miss
>
> //use "insample" to select the last observation, despite the year//
> regress ln_mean_age ln_mean_south ln_mean_union ln_mean_black if insample==1
> ************************