Thanks for the clarification, which looks like
a correction to me, e.g. for "from 0 to 1"
read "from 1 to 0".
It wastes your time -- and for what it's worth
other people's -- if you say what you don't mean
and mean what you don't say.
In your previous posting you discussed the
variable -sa-, sexually active, in terms of "is sexually
active" and "no longer sexually active", which prompted
my comments.
Now it appears that -sa- means "ever sexually active?",
which manifestly differs.
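
On that reading, a reported 0 after an earlier 1 for the same person
is the contradiction to look for. A minimal sketch of such a check,
assuming a long panel with -id-, -year- and -sa- coded 0/1/missing
as in the code quoted below; -contra- is just a name chosen here
for illustration:

```stata
* running maximum of -sa- within person, in time order;
* max() returns the largest nonmissing argument, so missings are skipped
gen max_sa_sofar = .
bysort id (year) : replace max_sa_sofar = max(sa, max_sa_sofar[_n-1])

* a 0 reported when the running maximum is already 1 means
* a 1 was seen in some earlier wave
gen byte contra = sa == 0 & max_sa_sofar == 1

* inspect the offending person-years
list id year sa if contra
```

Whether such cases should be recoded, dropped or left alone is a
separate question that the check itself does not answer.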
Could you please check these postings more carefully
before sending them off and confusing readers?
Nick
Scott Cunningham
> Thanks for the response. I'm going over it carefully, but I wanted
> to quickly clarify something. The contradictions that I'm worried
> about are not going from 0 to 1, but rather going from 1 to 0, which
> is impossible, given the nature of the event I'm describing
> (e.g., did the person ever have vaginal intercourse with a member of
> the opposite sex). This would be straightforward if there was just
> one question to appeal to, but unfortunately, the way the NLSY97 is
> set up, that simple question is asked in a variety of different ways
> to the 9000 different respondents, depending on their answers to
> many other questions.
>
> I'm reading more closely your recommendations now. Just wanted to
> clarify that point about the contradiction.
> On Oct 11, 2006, at 2:57 PM, Nick Cox wrote:
> >> 1. I am occasionally worried that I am replacing variables with
> >> values that are incorrect. In this example, it is easy to find
> >> contradictions, though. If someone is sexually active in an earlier
> >> wave (say 1997) but then later reports that they are no longer
> >> sexually active (say 2002), then it would mean the person reported
> >> he was not a virgin in 1997 but is a virgin in 2002. How do others
> >> of you check to make sure you do not have mistakes like this - once
> >> you have already reshaped the data into a panel, for instance? I
> >> think I do not possess enough of these checks in my programming, in
> >> fact, and am making many mistakes along the way that I'm not
> >> catching.
> >
> > I don't want to start a discussion on Statalist on quite what
> > is virginity, but unfortunately you seem to need to define exactly
> > what _you_ understand by it. I don't regard your example here
> > as contradictory at all as long as virgin means here "not
> > sexually active". Alternatively, if a person was ever previously
> > sexually active, I do not see how they can revert to being
> > a virgin (barring some legalistic redefinition).
> >
> > More generally, you can check for correctness if you independently
> > have correct answers or have some rule that guesses correct
> > answers for you (e.g. a majority vote). I don't see either here.
> >
> >> 3. Finally, sexual activity has holes, as I said, which if there
> >> are no contradictions (like going from 0 to 1 over time), can be
> >> corrected by filling all missing observations with a 0 or 1,
> >> assuming the first time a 1 appears is truly the first year the
> >> person made their sexual debut. What is the best way to fill in a
> >> missing value in the context of this type of duration modeling? I
> >> need to tell Stata to make all missing observations a 0, unless a 1
> >> had appeared at some point earlier, in which case replace with a 1.
> >
> > Again, going from 0 to 1 over time does not seem contradictory to
> > me.
> >
> > The maximum of -sa- seen so far is just
> >
> > gen max_sa_sofar = .
> > bysort id (year) : replace max_sa_sofar = max(sa, max_sa_sofar[_n-1])
> >
> > The way that the -max()- function works is that -max(0,.)- is 0,
> > -max(1,.)- is 1, etc., so that the usual rule that . is arbitrarily
> > large is set aside. (This is a feature not a bug.)
> >
> > This principle is implemented in the -egen- function -record()- from
> > -egenmore- on SSC, attributable to Kit Baum and S.B. Else.
> >
> > Thus you just need to copy across from this -max_sa_sofar- variable
> > whenever -sa- is missing. That still leaves open for discussion
> > whether this method of imputation is socially or sexually valid,
> > which I doubt.
> >