
Re: st: AW: "skipping" missing data

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: AW: "skipping" missing data
Date	Fri, 07 Aug 2009 08:51:33 -0500

This looks good. You could shorten it by using -egen, total()- but in practice that puts Stata to more work, although in many datasets you would hardly notice.
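
A minimal sketch of that shortening, assuming the same variable names as in the code quoted below: the two -bysort- lines that build nbobs collapse into one, because -egen, total()- accepts an expression.

```stata
* count the non-missing values of bperwk_ within each ptnum
bysort ptnum : egen nbobs = total(bperwk_ < .)
```

The result is the same in every observation of a panel, so no second pass with nbobs[_N] is needed.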


One of many alternatives, given here solely to show more technique, is

regress bperwk_ wk
bysort ptnum : egen nbobs = total(e(sample))
egen newgroup = group(ptnum) if nbobs > 1

The prior regress is not the one you want, but it's a way of tagging observations you want, as observations with any missings will not end up as part of the estimation sample, and so will be 0 on e(sample). That holds for any bundle of numeric variables.
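
As a cross-check on that reasoning, here is a sketch that counts usable observations both ways and compares them. The names nb1 and nb2 are made up for illustration; the data are assumed to be the poster's, with ptnum, bperwk_ and wk all numeric.

```stata
* tag the estimation sample with a throwaway regression on all the data
regress bperwk_ wk
bysort ptnum : egen nb1 = total(e(sample))

* count directly the observations that are complete on both variables
bysort ptnum : egen nb2 = total(!missing(bperwk_, wk))

* the two counts should agree in every observation
assert nb1 == nb2
```

The direct count needs no model fit; the e(sample) route pays off mainly when the list of variables is long.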


Nick

B. Timothy Walsh wrote:

Dear Nick,

Many thanks. I had to modify your suggested code a little, to reflect the fact that it isn't missing rows but missing data points within rows that is causing the problem. I came across your very helpful tutorial on -by:- in _Stata Journal_ 2(1):86-102 (2002), which provided the additional guidance.

Here's the code that does work. I'm happy to have suggestions for making it more elegant, if you have any. Again, many thanks.

Tim
------------------------------------------------
bysort ptnum : gen int nbobs = sum(bperwk_ < .)
bysort ptnum : replace nbobs = nbobs[_N]

egen newgroup = group(ptnum) if nbobs > 1
summarize newgroup, meanonly

forval i = 1/`r(max)' {
     regress bperwk_ wk if newgroup == `i'
     predict p
     replace p1=p if newgroup == `i'
     drop p
}
--------------------------------------

--On Thursday, August 06, 2009 12:55 PM -0500 Nick Cox <[email protected]> wrote:

Singleton panels are tagged as such by

bysort ptnum : gen allonmyown = _N == 1

Alternatively, panels with two or more are tagged as such by

bysort ptnum : gen twoormore = _N > 1

after which you can go

egen group = group(ptnum) if !missing(bperwk_, wk) & !allonmyown

OR

egen group = group(ptnum) if !missing(bperwk_, wk) & twoormore

B. Timothy Walsh wrote:

Thank you: this worked very nicely.
EXCEPT I now realize I also have instances in which there is only a
single data point for an individual. Is there a simple way to modify
this line?
egen group = group(ptnum) if !missing(bperwk_, wk)


--On Thursday, August 06, 2009 11:28 AM -0500 Nick Cox wrote:

Here is one of several alternatives.

generate p1=.
egen group = group(ptnum) if !missing(bperwk_, wk)
summarize group, meanonly

forval i = 1/`r(max)' {
      regress bperwk_ wk if group == `i'
      predict p
      replace p1=p if group == `i'
      drop p
}

That sets the missings on one side.

See also:

FAQ     . . . . . Making foreach go through all values of a variable
        . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        8/05    Is there a way to tell Stata to try all values of a
                particular variable in a foreach statement without
                specifying them?
                http://www.stata.com/support/faqs/data/foreach.html

Despite the reference to -foreach- the FAQ is still pertinent.

Nick

Martin Weiss wrote:

*************
capture
*************

You could put it in front of individual commands, or the entire
-forvalues- loop.
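
A sketch of the first option, assuming the variables and the 1/50 range from the original question. The check on _rc is an addition, not part of the suggestion above: without it, -predict- after a failed -regress- would silently use whatever estimates were left over from the previous individual.

```stata
generate p1 = .
forvalues i = 1/50 {
    * capture suppresses the error (and the output) and stores the return code in _rc
    capture regress bperwk_ wk if ptnum == `i'
    if _rc == 0 {
        * the regression ran, so these are this individual's estimates
        predict p
        replace p1 = p if ptnum == `i'
        drop p
    }
}
```

Individuals whose regression fails are simply left with missing p1.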


B. Timothy Walsh wrote:


I am attempting to generate predictions from regressions performed for
each of a longish list of individuals. The problem is that, for some
individuals, there are no dependent variable data (entries are
missing), so the regression attempt fails and the forvalues loop then
exits. I would like to somehow "skip" these individuals. The loop seems
to work fine if there are enough data to perform a regression. I'd be
grateful for any suggestions.

Here's the code:
generate p1=.
forvalues i = 1/50 {        //50 individuals
    regress bperwk_ wk if ptnum == `i'
    predict p
    replace p1=p if ptnum == `i'
    drop p
}

I'm pretty much a Stata novice. So, I apologize if I am missing
something obvious. Using version 10.1.

