Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Re: Bootstrapping with unbalanced panel
From: Heath Henderson <[email protected]>
To: [email protected]
Subject: Re: st: Re: Bootstrapping with unbalanced panel
Date: Mon, 5 Aug 2013 13:47:37 -0400
Thank you for the response, Nick. Any help is much appreciated. Each
cluster here represents an individual observed at multiple points in
time. Thus, in this data set, idcode and year uniquely identify the
observations.

Consider the draw of one cluster (e.g. idcode==1), which yields seven
observations in this case, one for each time period. This cluster, I
believe, is then assigned a value of newid (e.g. newid==1), and newid and
year will uniquely identify those seven observations.

Now, since we are sampling with replacement, consider what happens when
the same cluster is drawn again. We will then have seven more
observations, which will be assigned a different newid (e.g. newid==2).
For these two clusters, then, newid and year should uniquely identify
observations. Unless I am misunderstanding something (which I probably
am), newid and year should then uniquely identify all observations in
each bootstrapped sample, right?
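
One way to test this reasoning empirically is sketched below. It reruns the commands from the original message and then checks where the duplicates on newid and year come from. The seed value and the variable names mixed and newid2 are hypothetical and used only for illustration. The check assumes, without confirmation from the -bsample- documentation, that newid might be numbered separately within each stratum; the final step is a possible workaround only if that turns out to be the case.

```stata
* Diagnostic sketch, not a definitive fix
webuse union, clear
bysort idcode (year): gen byte panelsize = _N
set seed 12345                      // arbitrary seed, for reproducibility only
bsample, cluster(idcode) idcluster(newid) strata(panelsize)

* Count how often each (newid, year) pair occurs
duplicates report newid year

* Flag values of newid that appear with more than one panelsize;
* this would suggest newid is not unique across strata
bysort newid (panelsize): gen byte mixed = panelsize[1] != panelsize[_N]
tabulate mixed

* Hypothetical workaround: identify resampled clusters by stratum and newid jointly
egen long newid2 = group(panelsize newid)
duplicates report newid2 year
```

If mixed is 0 throughout, the duplicates have some other source and the newid2 step would not be expected to help.
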

Any clarification would be great, especially on how to constrain
sampling by year as you suggested. Thanks!

Heath
On Mon, Aug 5, 2013 at 12:13 PM, Nick Cox <[email protected]> wrote:
> Your -bsample- command did not constrain sampling of -year- so
> duplicates on -newid year- don't seem surprising to me.
>
> Nick
> [email protected]
>
>
> On 5 August 2013 14:54, Heath Henderson <[email protected]> wrote:
>
>> In a previous exchange "bootstrapping with unbalanced panel" a method for
>> preserving the structure of unbalanced panel data while bootstrapping was
>> discussed. The idea is to cluster by the firm/household/individual
>> identifier variable and stratify by a variable that counts the number of
>> times a given entity is observed. Here is an example of that approach:
>>
>> webuse union.dta
>> bysort idcode (year): gen byte panelsize = _N
>> bsample, cluster(idcode) idcluster(newid) strata(panelsize)
>>
>> The approach is conceptually straightforward and after drawing the sample
>> we would expect newid and the time variable to uniquely identify
>> observations. This, however, doesn't seem to be the case as xtset yields
>> the following error:
>>
>> . xtset newid year
>> repeated time values within panel
>> r(451);
>>
>> What might be going on here? Why wouldn't newid and year uniquely identify
>> observations? If it matters, I am using Stata 12 SE for Mac. Any help would
>> be much appreciated.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
--
Heath Henderson, PhD
Research Fellow
Inter-American Development Bank
1300 New York Avenue NW
Washington, DC 20577
Office: (202) 623-3860
Cell: (612) 867-4776