Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Re: Bootstrapping with unbalanced panel
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Re: Bootstrapping with unbalanced panel
Date: Mon, 5 Aug 2013 19:32:20 +0100

Sorry, I didn't read your previous post carefully enough, and I can't
follow what is going on. Perhaps you could give a full reference to the
previous exchange, "bootstrapping with unbalanced panel",
so that people who do this might comment.
Nick
[email protected]
On 5 August 2013 18:47, Heath Henderson <[email protected]> wrote:
> Thank you for the response Nick. Any help is much appreciated. Each
> cluster here represents an individual observed at multiple points in
> time. Thus in this data set idcode and year uniquely identify the
> observations.
>
> Consider the draw of one cluster (e.g. idcode==1), which yields seven
> observations in this case, one for each time period. This cluster, I
> believe, is then attributed newid (e.g. newid==1) and newid and year
> will uniquely identify those seven observations.
>
> Now since we are sampling with replacement, consider what happens when
> the same cluster is resampled. We will then have seven more
> observations, which will be attributed newid as well (e.g. newid==2).
> For these two clusters then, newid and year should uniquely identify
> observations. Unless I am misunderstanding something (which I probably
> am), newid and year should then uniquely identify all observations for
> each bootstrapped sample, right?
>
> Any clarification would be great, especially as to how to constrain
> sampling by year as you suggested. Thanks!
>
> Heath
>
> On Mon, Aug 5, 2013 at 12:13 PM, Nick Cox <[email protected]> wrote:
>> Your -bsample- command did not constrain sampling of -year- so
>> duplicates on -newid year- don't seem surprising to me.
>>
>> Nick
>> [email protected]
>>
>>
>> On 5 August 2013 14:54, Heath Henderson <[email protected]> wrote:
>>
>>> In a previous exchange "bootstrapping with unbalanced panel" a method for
>>> preserving the structure of unbalanced panel data while bootstrapping was
>>> discussed. The idea is to cluster by the firm/household/individual
>>> identifier variable and stratify by a variable that counts the number of
>>> times a given entity is observed. Here is an example of that approach:
>>>
>>> webuse union.dta
>>> bysort idcode (year): gen byte panelsize = _N
>>> bsample, cluster(idcode) idcluster(newid) strata(panelsize)
>>>
>>> The approach is conceptually straightforward and after drawing the sample
>>> we would expect newid and the time variable to uniquely identify
>>> observations. This, however, doesn't seem to be the case as xtset yields
>>> the following error:
>>>
>>> . xtset newid year
>>> repeated time values within panel
>>> r(451);
>>>
>>> What might be going on here? Why wouldn't newid and year uniquely identify
>>> observations? If it matters, I am using Stata 12 SE for Mac. Any help would
>>> be much appreciated.
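
One way to investigate the error above is to look at where the duplicated (newid, year) pairs come from. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that the identifiers created by idcluster() are unique only within each stratum, so the same newid value can appear in more than one panelsize group. The variable names mixed and panelid are hypothetical, and the seed is arbitrary.

```stata
* Sketch only: reproduce the setup from the thread, then inspect duplicates.
clear all
set seed 12345                      // arbitrary seed, for reproducibility only
webuse union, clear
bysort idcode (year): gen byte panelsize = _N
bsample, cluster(idcode) idcluster(newid) strata(panelsize)

* How many (newid, year) pairs occur more than once?
duplicates report newid year

* Flag (newid, year) pairs whose observations come from different strata.
bysort newid year (panelsize): gen byte mixed = panelsize[1] != panelsize[_N]
tabulate mixed

* Assumed workaround: combine stratum and newid into one panel identifier.
egen long panelid = group(panelsize newid)

* isid exits with an error if panelid and year do not uniquely identify
* observations, in which case the assumption above does not hold.
isid panelid year
xtset panelid year
```

If the duplicated pairs all turn out to span different values of panelsize, that is consistent with the assumption, and the combined identifier should let -xtset- run. If -isid- fails, the cause lies elsewhere and this workaround does not apply.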