Babigumira Ronnie wrote:
> I have data on household expenditure. Households bought "x" kgs (quan) of
> food crops (exp) worth a certain amount of money (unitvalu). However, I
> have cases where the quantity bought is missing (either because the
> household couldn't recall or because of an error in data collection),
> although the amount spent by these households is known. The variables are
>
> lc1code  housecode  exp  quan  unitvalu
> 11233    112331     566  1     500
>
> I would like to replace the missing quantities purchased with community
> (lc1code) averages. If the lc1code, food item (exp), unitvalu are the same
> then we can deduce the quantity (quan) that can be purchased by that
> amount of money. What I now want to do is to replace all missing
> quantities with a value imputed from community averages. To make this
> clearer:
>
> If we know that 500/= buys 1kg of cassava in a given community, then a
> respondent in the community who spends 500/= on cassava should
> automatically be purchasing 1kg.
>
> I want to write code that would do this automatically for all
> missing cases; however, I can't figure out where to start. I would
> appreciate any help.

Nick Winter has made suggestions using -collapse-.
An alternative would be to use -egen-.

Form groups based on your three variables:

    egen mean = mean(quan), by(lc1code exp unitvalu)
    replace quan = mean if missing(quan)

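To see what this does and does not fill in, here is a minimal sketch on made-up data. The values below are invented for illustration, and the names meanquan and imputed are my own choices, not part of the original data. The flag is there only so that imputed quantities can be told apart from reported ones later.

```stata
* Toy data (invented): one community, one food item
clear
input lc1code exp quan unitvalu
11233 566 1 500
11233 566 . 500
11233 566 2 1000
11233 566 . 750
end

* Mean reported quantity within community x item x amount spent
egen meanquan = mean(quan), by(lc1code exp unitvalu)

* Mark values that are about to be imputed, then fill them in
gen byte imputed = missing(quan) & !missing(meanquan)
replace quan = meanquan if missing(quan)

list, sepby(lc1code exp unitvalu)

* How many quantities are still missing?
count if missing(quan)
```

The household that spent 500 picks up the quantity reported by its neighbour who spent the same amount. The household that spent 750 stays missing, because nobody in that community reported a quantity for exactly that amount.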
However, this leaves quan missing in any group in which no
household reported a quantity. If missing values remain,
you might want to go on to a more general regression
approach.
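
One way such a regression step might look is sketched below. The model specification (quantity as a linear function of amount spent, with community and item indicators) is purely my assumption for illustration, as are the toy data and the names quanhat and imputed_reg. Whatever model you settle on should be checked against your own data first.

```stata
* Toy data (invented): two communities, two food items
clear
input lc1code exp quan unitvalu
11233 566 1 500
11233 566 2 1000
11233 566 3 1500
11233 566 . 750
11234 566 1 600
11234 566 2 1200
11234 566 . 900
11233 567 1 800
11233 567 2 1600
11234 567 1 900
11234 567 2 1800
end

* Fit on households with reported quantities (assumed specification)
regress quan unitvalu i.lc1code i.exp

* Linear predictions for every household with non-missing covariates
predict quanhat, xb

* Impute only where quan is missing and the prediction is usable;
* missing is treated as larger than any number, hence the explicit check
gen byte imputed_reg = missing(quan) & !missing(quanhat) & quanhat > 0
replace quan = quanhat if imputed_reg

list, sepby(lc1code exp)
```

Unlike the group means, this borrows information from households that spent different amounts, so it can fill in a case such as the 750 purchase that has no exact match. The price is that the result is only as good as the assumed model.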
Nick