-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------
--- Paulo Santos wrote:
> I have a dataset with the form
>
> ID Age gender Asset1 Asset2 Plot Area value_production value_input
> 1 20 1 . 10 1 . 5 2
> 1 20 1 . 10 2 2 4 1
>
> where "." are missing values. I would like to end up with a dataset with
> just 1 record per ID, with individual characteristics, including land area
> (the sum of the area of the plots) and non-land wealth (the sum of the
> value of the other assets). That is, for example:
>
> ID Age gender wealth land
> 1 20 1 15 4.5
>
> I was planning to use multiple imputation and -mvis- to "fill" the missing
> values of the area of some plots and of the value of some assets but the
> problem would be that I would end up with different values of, for
> example, asset1 for individual 1, as every observation is taken to be
> independent - and that doesn't make much sense.
I assume that the multiple observations for the same individual are
multiple observations over plots. Some of these variables don't change
(apparently asset1 is such a constant variable) and others do
(apparently value_production and value_input). I would first change the
data into wide format using -reshape- and then call -ice- (a newer
version of -mvis-) to do the imputations. This way you impute Asset1
only once per individual per `complete dataset', so no inconsistencies
can occur. Earlier today I posted an example on how to use -reshape-,
see http://www.stata.com/statalist/archive/2007-03/msg01021.html .
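
A minimal sketch of these two steps, assuming the long-format data from
the question (one row per ID and Plot) are already in memory and that
-ice- is installed (for example via -ssc install ice-). The output file
name "imputed", the number of imputations, and the seed are illustrative
choices, not values from the original question. The saving() option
follows later releases of -ice-; older releases may expect the output
file after -using- instead, so check -help ice- for the installed version.

```stata
* Assumption: long-format data are in memory, one row per ID-Plot.
* Age, gender, Asset1 and Asset2 must be constant within ID;
* otherwise -reshape- stops with an error.
reshape wide Area value_production value_input, i(ID) j(Plot)

* The data now hold one row per ID, with Area1, Area2, ... and so on.
* Impute on the wide data, so that Asset1 and Asset2 are imputed
* once per individual in each completed dataset.
* "imputed", m(5) and seed(12345) are hypothetical choices.
ice Age gender Asset1 Asset2 Area* value_production* value_input*, ///
    saving(imputed, replace) m(5) seed(12345)
```

Note that individuals with fewer plots than the maximum get missing
values for the extra plot variables after -reshape wide-, and -ice-
would impute those as well unless they are handled separately.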
Hope this helps,
Maarten