Erasmo,
this STB article
( http://www.stata.com/support/faqs/stat/stb13_rogers.pdf ) reports,
based on some experiments, that the clustered variance estimator
performs reasonably well as long as the largest cluster contains no
more than 5% of the observations. That implies one should have at
least 20 clusters or so. However, I am sure this rule of thumb does
not hold in all generality.
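
To see where a data set stands relative to that 5% guideline, something
like the following could be run before fitting the model. This is only a
rough sketch: the variable name industry is an assumption (substitute
whatever identifies your clusters), and cluster_n is a scratch variable
created just for this check.

```stata
* Hypothetical variable name: industry (the cluster identifier)
* Count the observations in each cluster
bysort industry: generate cluster_n = _N

* Share of the sample held by the largest cluster
quietly summarize cluster_n
display "largest cluster share = " %5.3f r(max)/_N

* Number of firms per cluster, for reference
tabulate industry
```

With 304 firms in 9 industries, the largest industry must contain at
least 34 firms, i.e. roughly 11% of the sample, so the 5% guideline
cannot be met with this grouping.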
Maybe you can classify your industries more finely, for example into
sub-industries? It is also worth looking at the literature in your
field to see how other people handle this issue.
Eva
2009/4/15 Erasmo Giambona <[email protected]>:
> Dear Eva and Garry,
>
> Thanks very much for your help. Would that imply by any means that I
> should avoid clustering in my case? Naively, it seems like I am
> shrinking my sample from 304 to only 9 observations. Hope you can
> comment on this.
>
> Regards,
>
> Erasmo
>
>
> On Wed, Apr 15, 2009 at 11:34 AM, Eva Poen <[email protected]> wrote:
>> Erasmo,
>>
>> this depends on the degrees of freedom. With clustering, your df can
>> be greatly reduced. If, for example, you have 7 degrees of freedom
>> (i.e. 8 industries), the p-value will be equal to 2*ttail(7,1.78)
>> which is around 0.118.
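>>
>> As a quick illustration of how much the reference distribution
>> matters, the lines below compare the two-sided p-value for t = 1.78
>> under a few assumptions. It is only a sketch: it assumes the test
>> uses (number of clusters - 1) degrees of freedom, and the normal
>> line is included purely as a large-sample benchmark.
>>
>> ```stata
>> * 8 clusters, 7 df (the example above)
>> display 2*ttail(7, 1.78)
>> * 9 clusters, 8 df (the case in the question)
>> display 2*ttail(8, 1.78)
>> * large-sample normal benchmark
>> display 2*normal(-1.78)
>> * critical value for a two-sided 10% test with 8 df
>> display invttail(8, 0.05)
>> ```
>>
>> With 8 df the p-value comes out at roughly 0.11, and the two-sided
>> 10% critical value is about 1.86, so a t of 1.78 falls short of it.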
>>
>> Hope this helps,
>> Eva
>>
>>
>>
>> 2009/4/15 Erasmo Giambona <[email protected]>:
>>> Dear Statalist,
>>>
>>> I am fitting a model to a cross-sectional data set of 304 firms across
>>> 9 industries. I fit the model using regress with the robust and
>>> cluster options (which I use to cluster standard errors at the
>>> industry level). One of the variables obtains a "t" of 1.78 but its
>>> p-value is "only" 0.114. Shouldn't the p-value be lower than 10% in
>>> this case?
>>>
>>> I really cannot explain this.
>>>
>>> I hope somebody could provide an explanation.
>>>
>>> Thanks,
>>>
>>> Erasmo