I am working on a very large data set that has a left-truncation problem -
specifically, incorrect founding dates for a fair number of firms, meaning
that I cannot really determine the age of these particular firms. In an
attempt to address this problem, I am exploring the possibility of dropping
the incorrect dates and using multiple imputation procedures to estimate
new dates. Because of confidentiality constraints associated with the use
of the aforementioned data set, I am limited to testing possible solutions
to this problem using a surrogate but structurally similar data set. My
question is about making an adjustment to the surrogate data set. The
surrogate data set contains about 3500 cases, some of which have just one
record per case, and others have multiple records. Each record has
variables recording an entry time and an exit time for that record.
Hence, single record cases have one entry time and one exit time; multiple
record cases have multiple entry and multiple exit times. I would like to
convert the first entry time for every fifth multiple-record case in this
data set into a missing value. Are there any suggestions for how this might
be done? Thanks.
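
To make the intended adjustment concrete, here is a minimal sketch of the
logic, written in Python only to pin down what I mean and not as a proposed
solution. The field names id, entry and exit are hypothetical stand-ins for
the case identifier and the two time variables. It also assumes that "every
fifth" means the 5th, 10th, 15th, ... multiple-record case when cases are
ordered by id, and that the "first" entry time is the earliest entry time
within a case.

```python
from collections import defaultdict


def blank_first_entry(records, step=5):
    """Return a copy of `records` in which the earliest entry time of
    every `step`-th multiple-record case is replaced by None (missing).

    Assumption: each record is a dict with the hypothetical keys
    "id", "entry" and "exit".
    """
    out = [dict(r) for r in records]  # copy so the input is left untouched

    # Collect the row positions belonging to each case.
    rows_by_case = defaultdict(list)
    for pos, rec in enumerate(out):
        rows_by_case[rec["id"]].append(pos)

    # Keep only cases with more than one record, ordered by id.
    multi_ids = sorted(cid for cid, rows in rows_by_case.items() if len(rows) > 1)

    # Pick the 5th, 10th, ... of those cases and blank the earliest entry.
    for cid in multi_ids[step - 1::step]:
        first = min(rows_by_case[cid], key=lambda pos: out[pos]["entry"])
        out[first]["entry"] = None
    return out


# Toy data: cases 1..12; even-numbered cases have two records, odd ones have one.
records = []
for cid in range(1, 13):
    records.append({"id": cid, "entry": 0, "exit": 5})
    if cid % 2 == 0:
        records.append({"id": cid, "entry": 5, "exit": 9})

adjusted = blank_first_entry(records)
changed = [(r["id"], r["exit"]) for r in adjusted if r["entry"] is None]
print(changed)  # [(10, 5)]
```

In the toy data the multiple-record cases are 2, 4, 6, 8, 10 and 12, so only
case 10 (the fifth of them) loses its first entry time; its second record and
all single-record cases are left alone.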