Hello Dr. Cox,
Thanks a lot for pointing out the issue with the share variable. I will look into the reference.
Divya.
---- Original message ----
>Date: Tue, 2 Sep 2008 12:57:57 +0100
>From: "Nick Cox" <[email protected]>
>Subject: RE: st: When number of regressors greater than the number of clusters in OLS regression
>To: <[email protected]>
>
>In the back-and-forth, with several penetrating comments from Mark
>Schaffer and Steve Samuels, one key question was raised by Steve but
>not, as far as I can see, really answered, and another key question was
>not raised at all.
>
>First off, at the risk of being obvious, the states for which data are
>available, taken as the sampled population, seem most unlikely on the
>face of it to be an undistorted sample of the target population,
>presumably all India.
>My guess would be that various states with no data, say those in remote
>or mountainous areas or politically or militarily sensitive, are also
>often states with low provision. (I'll bet Kashmir or Himachal Pradesh
>is not in the 17, for example.) As your research question seems likely
>to entail extra-statistical inference to all India, it would be vital to
>take account as far as you possibly can of the likely biases. For
>example, you could try to see where the 17 lie in the all-India
>frequency distributions for your predictors or for other
>standard-of-living measures or proxies.
>
>Second, share, whether measured as proportion (0-1) or percent (0-100%),
>is bounded, and that raises the question, often addressed on this list,
>of whether your modelling should pay direct attention to that. There is
>nothing in standard regression that guarantees predictions for such a
>response within feasible ranges, and worrying econometrics-style about
>how to handle the error term should surely take second place to thinking
>about the best handling of the response variable! At best this may not
>bite much in practice if values are near the middle of the range, 0.5 or
>50%, and vary little. However, a wild guess is that your likely range is
>much larger than that and that values near 0.1 or 0.9 may arise in some
>districts. The problem will be compounded if your project tempts you
>into making out-of-sample predictions for areas where share is expected
>to be low.
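
To make the point about bounded responses concrete, here is a minimal sketch in Python (standard library only). The data are made up purely for illustration and are not from the thread; the logit transform is used only as the simplest device to show, not as the method recommended in the tip cited below, and it breaks down when share is exactly 0 or 1.

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of a one-predictor least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Map any real number back into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: one predictor and a share measured as a proportion.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
share = [0.08, 0.15, 0.30, 0.55, 0.75, 0.90]

# Straight-line fit on the raw share.
a_lin, b_lin = ols_fit(x, share)
# Straight-line fit on the logit scale.
a_log, b_log = ols_fit(x, [logit(s) for s in share])

def predict_linear(x0):
    """Prediction from the raw-scale fit; nothing keeps it inside [0, 1]."""
    return a_lin + b_lin * x0

def predict_logit(x0):
    """Prediction from the logit-scale fit, back-transformed to a share."""
    return inv_logit(a_log + b_log * x0)

# Compare in-sample (3.5) and out-of-sample (0.0, 8.0) predictions.
for x0 in (0.0, 3.5, 8.0):
    print(x0, round(predict_linear(x0), 3), round(predict_logit(x0), 3))
```

With these made-up numbers the raw-scale line predicts a negative share at x = 0 and a share above 1 at x = 8, while the back-transformed predictions stay inside (0, 1) by construction. That is exactly the out-of-sample danger described above.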
>
>Kit Baum recently surveyed the leading options here in a concise and
>highly informative Stata Journal Tip:
>
>SJ-8-2  st0147 . . . . . . . . . . Stata tip 63: Modeling proportions
>        C. F. Baum
>        Q2/08   SJ 8(2):299--303   (no commands)
>        tip on how to model a response variable that appears
>        as a proportion or fraction
>
>and, as said, there has been much discussion on the list on how to
>handle proportional responses.
>
>Nick
>[email protected]
>
>Divya Balasubramaniam
>
>Thank you all for your invaluable suggestions. I really appreciate it.
>
>
>*
>* For searches and help try:
>* http://www.stata.com/help.cgi?search
>* http://www.stata.com/support/statalist/faq
>* http://www.ats.ucla.edu/stat/stata/
=======================================
Divya Balasubramaniam
Economics PhD Student
Terry College of Business
University of Georgia
Athens -30602.