I am still unclear about exactly why I do not need to cluster by State at all. Could you kindly explain it once more? Is it because my dataset is not a sample but covers 100% of the population? Or is there something else I need to consider?
So instead of
    areg Y X, absorb(state) robust cluster(state)
I will now run
    areg Y X, absorb(state) robust
Correct?
Also, can someone explain how to make inferences from individual coefficient estimates when this problem arises in an OLS regression (fewer clusters than regressors)?
Thanks,
Divya.
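
A minimal sketch of the two specifications discussed above, in Stata, on simulated data. The variable names (y, x1-x25, state), the seed, and the data-generating values are placeholders and not the census data; only the dimensions (436 districts, 17 states, 25 regressors) come from the thread.

```stata
* Sketch on simulated data; all variable names and values are hypothetical.
clear
set seed 12345
set obs 436                          // 436 districts, as in the thread
generate state = mod(_n, 17) + 1     // 17 made-up state codes
forvalues j = 1/25 {
    generate x`j' = rnormal()        // 25 placeholder regressors
}
generate y = 0.5*x1 + rnormal()      // placeholder outcome

* State fixed effects, standard errors clustered by state (17 clusters)
areg y x1-x25, absorb(state) vce(cluster state)

* State fixed effects, heteroskedasticity-robust standard errors only
areg y x1-x25, absorb(state) vce(robust)
```

With 17 clusters and 25 regressors, the clustered fit is expected to show a missing model F statistic: the cluster-robust covariance matrix has rank at most 16 (clusters minus one), so it cannot support a joint test of 25 coefficients, although individual standard errors are still printed. Since cluster() already implies robust, the two calls differ only in whether errors are allowed to be correlated within a state.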
---- Original message ----
>Date: Mon, 1 Sep 2008 16:59:59 -0400
>From: Steven Samuels <[email protected]>
>Subject: Re: st: When number of regressors greater than the number of clusters in OLS regression
>To: [email protected]
>
>Divya-
>
>So, you have n = 436. Just remove State as a cluster variable and
>continue with your modeling. You won't be troubled by the limit on
>regressors again; just keep the number to <=44 (10% of observations).
>
>Good luck!
>
>-Steven Samuels
>
>On Sep 1, 2008, at 4:22 PM, Divya Balasubramaniam wrote:
>
>> Hello Dr. Steven,
>>
>> My dependent variable is the share of the total number of households
>> in a district that have access to tap water. (I have the district totals.)
>>
>> Divya.
>> =======================================
>> Divya Balasubramaniam
>> Economics PhD Student
>> Terry College of Business
>> University of Georgia
>> Athens -30602.
>>
>> From: Steven Samuels <[email protected]>
>> Date: September 1, 2008 4:13:40 PM EDT
>> To: [email protected]
>> Subject: Re: st: When number of regressors greater than the number
>> of clusters in OLS regression
>> Reply-To: [email protected]
>>
>>
>> Divya,
>> I reread your question and realize that you probably do not have
>> sample data at all. The Census of India was not a sample but,
>> ideally, a 100% enumeration. (As in other countries, this will not
>> be perfectly true.) So I am not sure that you should be clustering
>> on State, or even on district, for that matter.
>> Please reply with details about your observations. For example, do
>> you have information on individual households or just district totals?
>>
>> Regards,
>>
>> Steven
>>
>>
>> On Sep 1, 2008, at 1:05 PM, Steven Samuels wrote:
>>
>>> More basic questions, Divya: What is your target population: the
>>> 17 states (of India, perhaps?) or the entire country? Were the 17
>>> states selected from all states by a sampling process? Or were
>>> they chosen in some other way--for example, because they had data
>>> available. Are all districts from the selected states in your
>>> sample?
>>>
>>>
>>> -Steven
>>> On Sep 1, 2008, at 12:35 PM, Divya Balasubramaniam wrote:
>>>
>>>> Dear Dr. Schaffer,
>>>>
>>>> I am using clustering in my analysis and I am having some trouble
>>>> understanding some of the important issues. I have read several
>>>> papers you have written on clustering issues and hence I am
>>>> emailing you to seek help.
>>>>
>>>> I am doing a district-level analysis for the census year 2001. I
>>>> have 436 districts in total, from 17 states. I run an OLS
>>>> regression of the share of households with tap water access on
>>>> several control variables (I have about 25 regressors). I use
>>>> the Stata command areg Y X, absorb(state) cluster(state), so I
>>>> have state fixed effects and standard errors clustered by state.
>>>>
>>>> My question is: I have more regressors (25) than clusters (17),
>>>> and the Stata output shows the F-statistic as missing. I would
>>>> like your advice on whether I can make inferences from the
>>>> individual coefficient estimates and the reported robust
>>>> standard errors. I did see your comment on this issue on the
>>>> Stata listserv. However, I could not find an answer on how to
>>>> fix the problem of having more regressors than clusters.
>>>>
>>>> I will be extremely thankful if you can kindly help me in this
>>>> regard.
>>>> Sincerely,
>>>> Divya.
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/statalist/faq
>>>> * http://www.ats.ucla.edu/stat/stata/