From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: AW: If-condition
Date: Mon, 5 Jul 2010 11:40:13 +0100

Another way to approach this is as follows. It is in the same spirit as Eric's code but a bit simpler.

1. We want to clear out of the way observations with problematic missing values. In this case

gen OK = !missing(var1, var2)

creates -OK-, which is 1 if -var1- and -var2- are both not missing and 0 otherwise.

2. Now we want to select the last observation in each panel which is OK.

bysort OK idcode (year) : replace OK = OK & _n == _N

3. Then we proceed with our analyses -if OK-.

Nick
n.j.cox@durham.ac.uk

Eric Booth

Mareike wrote:
> by idcode: g insample = 1 if var1!=. & var2!=. & _n ==_N
...
> But if for a certain country the latest observation available is for
> example 2005, 'insample' doesn't display a "1" but still a missing value.

Sounds like Stata is doing exactly what you ask of it. Your code tells Stata to mark insample==1 within each idcode only if all the conditions are true: var1 and var2 are not missing AND the observation is the last one in the panel (not the last NONMISSING observation in a panel--which is what you really want).

I'm not sure how to do this in a single line of code, but expanding my previous example, here's one way to do it:

************************
webuse union, clear

//setup the example data//
keep idcode year age black union south
replace year = 1900+year
**
ds year idcode, not
foreach x in `r(varlist)' {
    bys idcode: egen ln_mean_`x' = mean(`x')
    drop `x'
}

//new: create missing var1 var2 var//
forval n = 1/2 {
    g var`n' = runiform()*1000
    **create some missing data in var**
    qui su var`n', d
    replace var`n' = . if var`n'<r(p50)
}

//create "insample" indicator for last obs in panel//
sort idcode year
by idcode: g insample = 1 if _n ==_N

//new: fix insample where var1 and var2 are missing to be last NONMISSING observ//
by idcode: g i = 1 if mi(var1) & mi(var2)
by idcode: replace insample = 1 if i[_n+1]==1==insample[_n+1]
by idcode: replace insample = . if mi(var1) & mi(var2)
drop i

**check this**
li if insample==1 & mi(var1) & mi(var2), clean
ta year insample, miss

//use "insample" to select the last observation, despite the year//
regress ln_mean_age ln_mean_south ln_mean_union ln_mean_black if insample==1
************************

On Jul 3, 2010, at 3:52 AM, Mareike wrote:

> Thanks a lot for your answer. In general, I understand the logic behind your
> code and I think it might work for me.
> I actually just realized that I need to make the condition a bit more
> complex: I need to tell Stata to only work with a specific subsample of the
> data, in which two specific variables take non-missing values, and then to
> take the latest observation that is available for a certain country in this
> restricted sample.
> I tried to change your code a bit to do so, but it didn't have the expected
> result
> ...
> by idcode: g insample = 1 if var1!=. & var2!=. & _n ==_N
> ...
>
> Insample correctly takes the value "1" in 2008 for those countries that are
> part of the subsample and for which there exists an observation for the year
> 2008. But if for a certain country the latest observation available is for
> example 2005, 'insample' doesn't display a "1" but still a missing value.

Eric Booth
> Here's one way to do it:
> ************************
> webuse union, clear
>
> //setup the example data//
> keep idcode year age black union south
> replace year = 1900+year
> **
> ds year idcode, not
> foreach x in `r(varlist)' {
>     bys idcode: egen ln_mean_`x' = mean(`x')
>     drop `x'
> }
>
>
> //create "insample" indicator for last obs in panel//
> sort idcode year
> by idcode: g insample = 1 if _n ==_N
> ta year insample, miss
>
> //use "insample" to select the last observation, despite the year//
> regress ln_mean_age ln_mean_south ln_mean_union ln_mean_black if insample==1
> ************************

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/