Re: st: AW: "skipping" missing data
This looks good. You could shorten it by using -egen, total()- but in
practice that puts Stata to more work, although in many datasets you
would hardly notice.
One of many alternatives, given here solely to show more technique, is
regress bperwk_ wk
bysort ptnum : egen nbobs = total(e(sample))
egen newgroup = group(ptnum) if nbobs > 1
The initial -regress- is not the regression you want; it is just a way of
tagging the observations you want. Any observation with a missing value on
either variable is left out of the estimation sample, and so has e(sample)
equal to 0. The same trick works for any bundle of numeric variables.
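The -egen, total()- shortening mentioned above might look like the following minimal sketch. It assumes the variable names used elsewhere in this thread (ptnum, bperwk_, wk) and counts, within each patient, the observations that are complete on both variables.

```stata
* number of observations per patient with both variables nonmissing
bysort ptnum : egen nbobs = total(!missing(bperwk_, wk))
* number only those patients with at least two usable observations
egen newgroup = group(ptnum) if nbobs > 1
```

As with the e(sample) approach, patients with fewer than two complete observations get a missing newgroup and so are never visited by a loop over 1/`r(max)'.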
Nick
B. Timothy Walsh wrote:
Dear Nick,
Many thanks--I had to modify your suggested code a little, to reflect
the fact that it isn't missing rows but missing datapoints within rows
that is causing the problem. I came across your very helpful tutorial on
-by:- in _Stata Journal_ 2(1):86-102 (2002) which provided the additional
guidance.
Here's the code that does work. I'm happy to have suggestions for making
it more elegant, if you have any. Again, many thanks.
Tim
------------------------------------------------
generate p1 = .
bysort ptnum : gen int nbobs = sum(bperwk_ < .)
bysort ptnum : replace nbobs = nbobs[_N]
egen newgroup = group(ptnum) if nbobs > 1
summarize newgroup, meanonly
forval i = 1/`r(max)' {
    regress bperwk_ wk if newgroup == `i'
    predict p
    replace p1 = p if newgroup == `i'
    drop p
}
--------------------------------------
--On Thursday, August 06, 2009 12:55 PM -0500 Nick Cox wrote:
Singleton panels are tagged as such by
bysort ptnum : gen allonmyown = _N == 1
Alternatively, panels with two or more are tagged as such by
bysort ptnum : gen twoormore = _N > 1
after which you can go
egen group = group(ptnum) if !missing(bperwk_, wk) & !allonmyown
OR
egen group = group(ptnum) if !missing(bperwk_, wk) & twoormore
B. Timothy Walsh wrote:
Thank you: this worked very nicely.
EXCEPT I now realize I also have instances in which there is only a
single data point for an individual. Is there a simple way to modify
this line?
egen group = group(ptnum) if !missing(bperwk_, wk)
--On Thursday, August 06, 2009 11:28 AM -0500 Nick Cox wrote:
Here is one of several alternatives.
generate p1=.
egen group = group(ptnum) if !missing(bperwk_, wk)
summarize group, meanonly
forval i = 1/`r(max)' {
    regress bperwk_ wk if group == `i'
    predict p
    replace p1 = p if group == `i'
    drop p
}
That sets the missings on one side.
See also:
FAQ: Making foreach go through all values of a variable (N. J. Cox, 8/05)
Is there a way to tell Stata to try all values of a particular variable
in a foreach statement without specifying them?
http://www.stata.com/support/faqs/data/foreach.html
Despite the reference to -foreach- the FAQ is still pertinent.
Nick
Martin Weiss wrote:
*************
capture
*************
You could put it in front of individual commands, or the entire
-forvalues- loop.
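A minimal sketch of how -capture- might be applied to the loop in the original question follows. It uses only the variable names from that post; the check on _rc is an addition here, not part of the suggestion above.

```stata
generate p1 = .
forvalues i = 1/50 {
    * capture suppresses the error, so the loop continues past a failed fit
    capture regress bperwk_ wk if ptnum == `i'
    * _rc is 0 only if the regression ran without error
    if _rc == 0 {
        predict p if ptnum == `i'
        replace p1 = p if ptnum == `i'
        drop p
    }
}
```

Checking _rc matters because, with -capture- alone, -predict- would still be attempted after a failed regression and could either fail itself or use results left over from a previous patient.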
B. Timothy Walsh wrote:
I am attempting to generate predictions from regressions performed for
each of a longish list of individuals. The problem is that, for some
individuals, there are no dependent variable data (entries are
missing), so the regression attempt fails and the
forvalues loop then exits. I would like to somehow "skip" these
individuals. The loop seems to work fine if there are enough data to
perform a regression. I'd be grateful for any suggestions.
Here's the code:
generate p1=.
forvalues i = 1/50 { //50 individuals
    regress bperwk_ wk if ptnum == `i'
    predict p
    replace p1 = p if ptnum == `i'
    drop p
}
I'm pretty much a Stata novice. So, I apologize if I am missing
something obvious. Using version 10.1.