Re: st: Small sample with clustered data
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Small sample with clustered data
Date: Tue, 29 Nov 2011 06:19:24 -0500
Lars <[email protected]>:
You can estimate the bias in the SE via simulation of data just like
yours where you control the correlations and actual treatment effects;
if the rejection rate of a nominal 1% test is in the 5% range for some
coefficients, and tests for other coefficients (vars with no
clustering) have correct size, perhaps you just use a higher standard
of "significance" for some coefs than others. You will have to run
millions or at least hundreds of thousands of simulations, though,
which will take some time... it is faster to just add the caveat
"significance of results must be interpreted with caution" when
reporting OIM, het-robust, or cluster-robust SEs.
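
A minimal sketch of the kind of simulation described above, written in
Python (standard library only) rather than Stata so that it is
self-contained. Everything about the data-generating process is an
assumption made for illustration: 50 countries with one observation
each, 26 EU members that share a single common shock, one regressor x
that is correlated with EU membership, and a true effect of x equal to
zero. The function names, the seed, and the rounded critical value
CRIT are hypothetical. The SE is the usual het-robust (HC1) estimate,
and the test is a nominal 5% test, not the 1% test mentioned above.

```python
import random

# Assumed: approximate two-sided 5% critical value for a t distribution
# with about 47-48 degrees of freedom.
CRIT = 2.01


def one_rejection(rng, n=50, n_eu=26, shock_sd=1.0, control_eu=False):
    """Simulate one data set with a true slope of 0; return True if H0 is rejected."""
    eu = [1] * n_eu + [0] * (n - n_eu)
    shock = rng.gauss(0.0, shock_sd)  # one shock shared by all EU members
    # Regressor correlated with EU membership (assumed design).
    x = [e + rng.gauss(0.0, 1.0) for e in eu]
    # Outcome: shared EU shock plus idiosyncratic noise; x has no effect.
    y = [e * shock + rng.gauss(0.0, 1.0) for e in eu]

    # Partial out the intercept (and optionally the EU dummy) by demeaning.
    # By the Frisch-Waugh theorem this gives the same slope and residuals
    # as OLS with those terms included.
    if control_eu:
        groups = [[i for i in range(n) if eu[i] == g] for g in (0, 1)]
    else:
        groups = [list(range(n))]
    xd, yd = x[:], y[:]
    for idx in groups:
        mx = sum(x[i] for i in idx) / len(idx)
        my = sum(y[i] for i in idx) / len(idx)
        for i in idx:
            xd[i] = x[i] - mx
            yd[i] = y[i] - my

    k = 3 if control_eu else 2  # number of estimated coefficients
    sxx = sum(v * v for v in xd)
    b = sum(a * c for a, c in zip(xd, yd)) / sxx
    resid = [c - b * a for a, c in zip(xd, yd)]

    # Het-robust (HC1) sandwich variance for the slope.
    meat = sum((a * r) ** 2 for a, r in zip(xd, resid))
    se = (n / (n - k) * meat) ** 0.5 / sxx
    return abs(b / se) > CRIT


def rejection_rate(reps=2000, seed=12345, **kwargs):
    """Share of simulated data sets in which the true null is rejected."""
    rng = random.Random(seed)
    return sum(one_rejection(rng, **kwargs) for _ in range(reps)) / reps


if __name__ == "__main__":
    print("no common shock:      ", rejection_rate(shock_sd=0.0))
    print("common EU shock:      ", rejection_rate(shock_sd=1.0))
    print("shock + EU dummy in X:", rejection_rate(shock_sd=1.0, control_eu=True))
```

Comparing the three rejection rates with the nominal 5% gives a rough
idea of how far off the test size is under each assumed design. A few
thousand replications are enough for a 5% test in a sketch like this;
pinning down the size of a 1% test precisely needs far more, as noted
above.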
On Mon, Nov 28, 2011 at 4:14 PM, <[email protected]> wrote:
> Dear Austin,
>
> thank you for your reply. If I understand you correctly,
> you suggest using cluster(countryid) after the regression, while
> controlling for euclus. Countryid is a number from 1 to 50. This works.
> The results are the same as if I use the robust option after the regression.
> So do you think this is the best option, and that I should state that the SEs are
> probably biased downward and thus significant results have to be interpreted with caution?
> What if the coefficients are still significant even though I do not use the cluster option? Is there a way
> to estimate the bias?
>
> Best
>
> Lars
>
>
> -----Original Message-----
> From: "Austin Nichols" <[email protected]>
> Sent: 28.11.2011 20:00:41
> To: [email protected]
> Subject: Re: st: Small sample with clustered data
>
>>Lars <[email protected]>:
>>You are likely to have SEs biased downward no matter what you do, if
>>you use the 24 cluster design--can you cluster by country (50
>>clusters) but include eucluster as an explanatory variable?
>>
>>On Mon, Nov 28, 2011 at 6:24 AM, <[email protected]> wrote:
>>> Dear Statalist,
>>>
>>> My sample consists of 50 countries with 26 of them being EU Member States.
>>> The problem is that the values of the dependent variable for the EU members are not
>>> independent of each other. Thus, I created a dummy variable "eucluster" that indicates
>>> if a country is in the EU (1=yes; 0=no) and used the cluster(eucluster) option after the
>>> OLS regressions in Stata 10. However, in "Clustered Errors in Stata"
>>> (Nichols/Schaffer 2007, http://repec.org/usug2007/crse.pdf) it is mentioned that if M,
>>> the number of clusters, is small, matters could even get worse by using the cluster option (slide 20).
>>> M=50 seems to be the minimum number of clusters required.
>>>
>>> I have 24 clusters consisting of 1 country each and 1 cluster comprising 26 EU members (6 independent variables).
>>> I do not know how to deal "correctly" with these clustered data in Stata. Hence, I would highly appreciate it if someone could
>>> give me advice or suggest a solution on how to deal with the clustered data in such a small sample.