The effect of your -egen, group()- is
to lump all the missings on -county-
and/or -household- together. In cases
where -household- is missing but not
-county-, or vice versa, that throws
away some information.
-egen, group() missing- will do a bit
better.
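
To see the difference, here is a minimal sketch on made-up toy data (the values and the names uid and uid2 are for illustration only). Without the -missing- option, any observation missing on -county- or -household- gets a missing identifier. With it, each distinct combination, missings included, gets its own identifier.

```stata
* Toy data (illustrative values only): some observations lack
* county, household, or both
clear
input county household
1 1
1 2
1 .
. 2
. .
end

* Default: uid is missing whenever county or household is missing
egen uid = group(county household)

* With -missing-: missing values are treated like any other value,
* so (1, .), (., 2) and (., .) each get a distinct identifier
egen uid2 = group(county household), missing

list, sepby(county)
```
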
But the reconstruction of missing data
seems somewhere between difficult and
impossible, at least on the information
you provide.
For example, suppose
you have -county- but not -household-.
There seem to be two possibilities. The
household is in fact one of the other
households in the same county in
your dataset, or it is not. Do you
have any grounds to say which is correct?
Conversely, suppose you have -household-
but not -county-. It may be that your numbering
system will enable you to reconstruct the
-county-.
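
As a sketch only: suppose, purely hypothetically, that -household- was coded as 1000 times the county code plus a serial number within county. Nothing in your post says this is so, and the factor 1000 is an assumption made up for illustration. Under that coding the county could be recovered like this:

```stata
* HYPOTHETICAL coding scheme, assumed for illustration only:
* household = 1000 * county + serial number within county
clear
input county household
12 12001
.  12002
.  7015
end

* Recover county from the leading digits of household where it is missing
replace county = floor(household / 1000) if missing(county)

list
```

If your identifiers follow some other scheme, the expression would need to change accordingly. If they carry no county information at all, nothing like this will work.
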
Finally, suppose you have neither -household-
nor -county-. If there is a method for
imputing, it must be based on the other variables.
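
One case where another variable may be enough: if the missings were created by -tsfill, full- and the identifier from -egen, group()- is known for every observation, then -county- and -household- are constant within that identifier by construction and can be copied from the observed years. The sketch below uses made-up toy data and assumes that both variables are numeric and that every panel has at least one observed year.

```stata
* Toy panel with gaps (illustrative values only)
clear
input county household year
1 1 2000
1 1 2002
2 5 2001
2 5 2002
end

egen uid = group(county household)
tsset uid year
tsfill, full

* Numeric missings sort last, so after sorting within uid the first
* value is nonmissing whenever the panel has any observed value.
* For string variables the empty string sorts first, so use [_N] instead.
bysort uid (county): replace county = county[1] if missing(county)
bysort uid (household): replace household = household[1] if missing(household)

* Restore the panel sort order
sort uid year
list, sepby(uid)
```

This only works for variables that do not change within panel. Covariates that vary over time cannot be recovered this way.
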
Nick
Alexander Nervedi
>
> I have panel data with gaps. After tsfill, full I have a
> complete panel, but there are many covariates, some string and
> some numeric, that are not actually complete. For example,
>
> egen uid = group(county household)
> tsset uid year
> tsfill, full
>
>
> will generate missing values for county and household to fill
> in the gaps, even though uid and year are complete. What is a
> good way to fill in missing observations for variables like
> county and household?