Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | gemini mtei <gjmt_99@yahoo.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: predicting consumption |
Date | Wed, 9 Mar 2011 09:17:13 -0800 (PST) |
Thanks for the suggestion, Joerg. I have tried the glm-gamma option, but the estimated coefficients I am getting are almost the same as the ones I get using OLS, and they give similar predictions to OLS.

G.

--- On Wed, 9/3/11, Joerg Luedicke <joerg.luedicke@gmail.com> wrote:

> From: Joerg Luedicke <joerg.luedicke@gmail.com>
> Subject: Re: st: predicting consumption
> To: statalist@hsphsun2.harvard.edu
> Date: Wednesday, 9 March, 2011, 16:34
>
> On Wed, Mar 9, 2011 at 11:03 AM, gemini mtei <gjmt_99@yahoo.com> wrote:
> > I am trying to predict household total consumption from the national household budget survey into a small survey that we conducted but in which we did not collect consumption. I have used a linear model (OLS) as follows:
> >
> > log(consumption) = B0 + B1*wealth + B2*log(household size) + B3*wealth*log(household size) + B4*wealth*location
> >
> > where wealth is measured by an asset index constructed from ownership of assets, housing characteristics, source of utilities, and household-head characteristics (i.e. education and employment). Location captures urban-rural differences.
> >
> > The model is giving me an R-square of .55, and I have done all diagnostic tests and it seems fine. I have used the split-half method to validate the predicted consumption (i.e. selecting a random sample from the household survey, running the consumption model on it, predicting into the remaining sample, and then comparing with actual consumption). The problem I am facing is that the model over-predicts consumption for households with low consumption, while it under-predicts for households with higher consumption.
> >
> > I need the predicted consumption for the analysis of out-of-pocket financing incidence in the small survey I mentioned above. The two surveys differed only slightly in their implementation time, and my assumption is that, since the household budget survey is nationally representative, I can use it to predict consumption into this small survey.
> > Can you advise whether I am making a mistake in model specification? Is there a special case in predicting with interactions?
>
> I am merely guessing, but an OLS model might not be the right choice. The fact that:
>
> Quote:
> "I am facing is the model over-predicts consumption for the households with low consumption while it under-predicts for households with higher consumption"
>
> seems to indicate that other distributions may be a better fit. Have you checked anything beyond OLS? For instance, might a gamma glm:
>
>     glm consumption [your indep vars], family(gamma) link(log)
>
> be a better fit?
>
> J.

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
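
The split-half check described in the quoted message, and the comparison between the log-linear OLS model and the gamma GLM suggested in the reply, could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the variable names consumption, wealth, hhsize and urban are hypothetical and do not come from the thread, and the seed and the 50/50 split are arbitrary.

```stata
* Sketch only: consumption, wealth, hhsize and urban are hypothetical
* variable names, not taken from the original post.
set seed 12345
generate double lncons   = ln(consumption)
generate double lnhhsize = ln(hhsize)

* Random split: roughly half the households are used for estimation
generate byte train = runiform() < 0.5

* Log-linear OLS as described in the post, estimated on the training half
regress lncons c.wealth##c.lnhhsize c.wealth#i.urban if train
predict double xb_ols, xb
* Naive back-transformation: exp() of a predicted log is not the
* predicted mean of consumption, so this is an assumption-laden shortcut
generate double pred_ols = exp(xb_ols)

* Gamma GLM with log link, same covariates, same training half
glm consumption c.wealth##c.lnhhsize c.wealth#i.urban if train, family(gamma) link(log)
predict double pred_glm, mu

* Hold-out comparison: mean actual vs. predicted, by quintile of prediction
xtile q = pred_glm if !train, nquantiles(5)
tabstat consumption pred_ols pred_glm if !train, by(q) statistics(mean)
```

The table is grouped by quintile of the prediction rather than of actual consumption on purpose. With an R-square of about .55, grouping by the actual value would be expected to show over-prediction at the bottom and under-prediction at the top for almost any model, simply through regression toward the mean, so that pattern on its own may not indicate misspecification.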