I have household income survey data (38,000 observations), and my
problem is running a multiple regression of saving (dependent var) on
ethnicity/strata/employment etc. (independent vars).
The problem is this: 70% of my observations have a saving value of
zero. I recoded the zeros to 1 and took logs, but the distribution is
still extremely skewed (mean 0.78, std. dev. 2.4, min 0, max 14). The
histogram still looks like the letter L, extremely skewed to the
right with a long tail. Obviously, OLS is out, and I tried Poisson
(glm nbinomial), but the distribution is still not normal.
The data are in order, i.e. no missing values etc. It is clean. For
some reason, lobit would not run.
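
A minimal sketch of the recode-and-log step described above, written in
Python on simulated data. The variable names, the 70% zero share used to
generate the sample, and the lognormal shape of the positive values are
illustrative assumptions, not the survey data. It only shows that
recoding zeros to 1 before logging leaves the same share of
observations piled up at a single value, now at log(1) = 0.

```python
import math
import random


def recode_and_log(values):
    """Recode zeros to 1, then take the natural log of every value."""
    return [math.log(1.0 if v == 0 else v) for v in values]


random.seed(0)

# Hypothetical sample: roughly 70% zeros, the rest positive and right-skewed.
saving = [
    0.0 if random.random() < 0.7 else math.exp(random.gauss(7, 2))
    for _ in range(10000)
]

log_saving = recode_and_log(saving)

# Share of observations sitting exactly at zero before and after the transform.
share_zero_before = sum(v == 0 for v in saving) / len(saving)
share_zero_after = sum(v == 0 for v in log_saving) / len(log_saving)

print("share at zero before:", share_zero_before)
print("share at zero after: ", share_zero_after)
```

The two shares come out the same, which is why the histogram keeps its
L shape after the transformation.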
What do you suggest? Thank you so much.
Muhammed.