

Re: st: When number of regressors greater than the number of clusters in OLS regression

From   Divya Balasubramaniam <[email protected]>
To   [email protected]
Subject   Re: st: When number of regressors greater than the number of clusters in OLS regression
Date   Mon, 1 Sep 2008 17:26:28 -0400 (EDT)

I am still quite unclear about exactly why I do not need to cluster by State at all. Can you kindly explain it to me one more time? Is it because my dataset is not a sample but accounts for 100% of the population? Or is there something else I need to consider?

So, instead of areg Y on X, absorb(state) robust cluster(state), I will now run areg Y on X, absorb(state) robust.

Also, can someone explain how to make inferences about individual coefficient estimates when we encounter this kind of problem in OLS regression (fewer clusters than regressors)?
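
To illustrate why the joint F-statistic can go missing in this situation, here is a minimal NumPy sketch (not Stata output). It assumes simulated data with the sizes quoted in this thread (436 districts, 17 states, 25 regressors); the variable names and the helper demean_by_group are hypothetical, and no small-sample correction is applied. The cluster-robust "meat" is a sum of one outer product per cluster, and the cluster scores sum to zero by the OLS normal equations, so the variance matrix has rank of at most 17 - 1 = 16. That is too low for a joint test of 25 coefficients, although each individual standard error can still be computed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes quoted in the thread; the data themselves are simulated (an assumption).
n_obs, n_clusters, n_regressors = 436, 17, 25

# Assign each district to a state, making sure every state appears at least once.
state = rng.integers(0, n_clusters, size=n_obs)
state[:n_clusters] = np.arange(n_clusters)

X = rng.normal(size=(n_obs, n_regressors))
y = X @ rng.normal(size=n_regressors) + rng.normal(size=n_obs)


def demean_by_group(a, groups):
    """Subtract group means: the within transformation implied by state fixed effects."""
    out = a.astype(float).copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= out[mask].mean(axis=0)
    return out


Xw = demean_by_group(X, state)
yw = demean_by_group(y, state)

# OLS on the demeaned data.
beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
resid = yw - Xw @ beta

# One score vector per cluster: X_g' e_g (17 rows, 25 columns).
scores = np.vstack(
    [Xw[state == g].T @ resid[state == g] for g in range(n_clusters)]
)

# Sandwich estimator without finite-sample adjustments.
meat = scores.T @ scores
bread = np.linalg.inv(Xw.T @ Xw)
V_cluster = bread @ meat @ bread

# Numerical rank of the meat (equal to the rank of V_cluster, since bread is invertible).
eigvals = np.linalg.eigvalsh(meat)
rank_V = int(np.sum(eigvals > 1e-8 * eigvals.max()))

# Individual standard errors are still available from the diagonal.
se = np.sqrt(np.diag(V_cluster))

print("rank of cluster-robust VCE:", rank_V)
print("number of regressors:", n_regressors)
```

Because the rank falls short of the number of regressors, a Wald test of all 25 coefficients jointly would require inverting a singular matrix. Tests on single coefficients do not need that inverse, but with only 17 clusters they rest on few effective observations, so they should be read with caution.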


---- Original message ----
>Date: Mon, 1 Sep 2008 16:59:59 -0400
>From: Steven Samuels <[email protected]>  
>Subject: Re: st: When number of regressors greater than the number of clusters in OLS regression  
>To: [email protected]
>So, you have n = 436. Just remove State as a cluster variable and  
>continue with your modeling. You won't be troubled by the limit on  
>regressors again; just keep the number to <=44 (10% of observations).
>Good luck!
>-Steven Samuels
>On Sep 1, 2008, at 4:22 PM, Divya Balasubramaniam wrote:
>> Hello Dr. Steven,
>> My dependent variable is the share of the total number of households  
>> in a district having access to tap water. (I have the district totals.)
>> Divya.
>> =======================================
>> Divya Balasubramaniam
>> Economics PhD Student
>> Terry College of Business
>> University of Georgia
>> Athens -30602.
>> From: Steven Samuels <[email protected]>
>> Date: September 1, 2008 4:13:40 PM EDT
>> To: [email protected]
>> Subject: Re: st: When number of regressors greater than the number  
>> of clusters in OLS regression
>> Reply-To: [email protected]
>> Divya,
>> I reread your question and realize that you probably do not have  
>> sample data at all. The Census of India was not a sample at all,  
>> but, ideally, was a 100% enumeration. (Just as in other countries,  
>> this will not be perfectly true.) So, I am not sure that you should  
>> be clustering on State, or even on district, for that matter.  
>> Please reply with details about your observations. For example, do  
>> you have information on individual households or just district totals?
>> Regards,
>> Steven
>> On Sep 1, 2008, at 1:05 PM, Steven Samuels wrote:
>>> More basic questions, Divya:  What is your target population:  the  
>>> 17 states (of India, perhaps?) or the entire country?  Were the 17  
>>> states selected from all states by a sampling process?  Or were  
>>> they chosen in some other way--for example, because they had data  
>>> available.  Are all districts from the selected states in your  
>>> sample?
>>> -Steven
>>> On Sep 1, 2008, at 12:35 PM, Divya Balasubramaniam wrote:
>>>> Dear Dr. Schaffer,
>>>> I am using clustering in my analysis and I am having some trouble  
>>>> understanding some of the important issues. I have read several  
>>>> papers you have written on clustering issues and hence I am  
>>>> emailing you to seek help.
>>>> I am doing a district-level analysis for the census year 2001. I  
>>>> have 436 districts in total, from 17 states. I run an OLS  
>>>> regression of the share of households with tap water access on  
>>>> several control variables (I have about 25 regressors). I use  
>>>> the Stata command areg Y on X, absorb(State) cluster(state). I  
>>>> have state fixed effects and cluster by state.
>>>> My question is: I have more regressors (25) than clusters (17),  
>>>> and the F-statistic is missing from the Stata output.  
>>>> I would like to seek your advice on whether I can make  
>>>> inferences from the individual coefficient estimates and  
>>>> the reported robust standard errors. I did see your comment on  
>>>> this issue on the Stata listserv. However, I could not find  
>>>> answers as to how to fix this problem of having more regressors  
>>>> than clusters.
>>>> I will be extremely thankful if you can kindly help me in this  
>>>> regard.
>>>> Sincerely,
>>>> Divya.
>>>> =======================================
>>>> Divya Balasubramaniam
>>>> Economics PhD Student
>>>> Terry College of Business
>>>> University of Georgia
>>>> Athens -30602.
>>>> *
>>>> *   For searches and help try:
>>>> *
>>>> *
>>>> *
Divya Balasubramaniam
Economics PhD Student
Terry College of Business
University of Georgia
Athens -30602.

© Copyright 1996–2025 StataCorp LLC