Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Re: Bootstrapping with unbalanced panel
From: Jeph Herrin <[email protected]>
To: [email protected]
Subject: Re: st: Re: Bootstrapping with unbalanced panel
Date: Mon, 05 Aug 2013 15:57:49 -0400
Yes, there seems to be a bug: it looks like -bsample- is creating -newid- values that are unique within strata, but not
across strata, so the same -newid- is reused in every stratum:
. webuse union, clear
(NLS Women 14-24 in 1968)
. bysort idcode (year): gen byte panelsize = _N
. bsample, cluster(idcode) idcluster(newid) strata(panelsize)
. tab panelsize newid if newid<6 // Just look at the first few -newids-
           |               Bootstrap sample cluster id
 panelsize |         1          2          3          4          5 |     Total
-----------+-------------------------------------------------------+----------
         1 |         1          1          1          1          1 |         5
         2 |         2          2          2          2          2 |        10
         3 |         3          3          3          3          3 |        15
         4 |         4          4          4          4          4 |        20
         5 |         5          5          5          5          5 |        25
         6 |         6          6          6          6          6 |        30
         7 |         7          7          7          7          7 |        35
         8 |         8          8          8          8          8 |        40
         9 |         9          9          9          9          9 |        45
        10 |        10         10         10         10         10 |        50
        11 |        11         11         11         11         11 |        55
        12 |        12         12         12         12         12 |        60
-----------+-------------------------------------------------------+----------
     Total |        78         78         78         78         78 |       390
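If you want to confirm the overlap without reading the table, something like the following might do. It is only a sketch: it counts how many strata each -newid- shows up in, and the variable names onepair and nstrata are made up for the illustration. The exact counts depend on the random draw, so none are shown here.

```stata
* Sketch: rebuild the bootstrap sample, then count strata per newid.
webuse union, clear
bysort idcode (year): gen byte panelsize = _N
bsample, cluster(idcode) idcluster(newid) strata(panelsize)

* Flag one observation per (newid, panelsize) pair (hypothetical name: onepair).
egen byte onepair = tag(newid panelsize)

* Number of distinct strata in which each newid appears (hypothetical name: nstrata).
bysort newid: egen nstrata = total(onepair)

* If newid were unique across strata, nstrata would be 1 for every observation.
tabulate nstrata

* Observations that share a (newid, year) pair are what make -xtset- fail.
duplicates report newid year
```

Any value of nstrata above 1 means that -newid- alone is pooling clusters drawn from different strata.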
You can get around this by making a "new" newid from the combination of -newid- and -panelsize-:
. egen newnewid=group(newid panelsize)
. xtset newnewid year
       panel variable:  newnewid (unbalanced)
        time variable:  year, 70 to 88, but with gaps
                delta:  1 unit
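To check that the combined identifier behaves as intended, a couple of assertions along these lines could be added. This is a sketch that repeats the steps above so it runs on its own. The seed value is arbitrary and is included only as an assumption, to make the draw reproducible.

```stata
* Sketch: rebuild the sample, apply the workaround, and verify it.
webuse union, clear
set seed 12345                      // arbitrary seed, for reproducibility only
bysort idcode (year): gen byte panelsize = _N
bsample, cluster(idcode) idcluster(newid) strata(panelsize)
egen newnewid = group(newid panelsize)

* -isid- exits with an error unless newnewid and year jointly identify observations.
isid newnewid year

* Each resampled panel should have as many observations as its stratum implies.
bysort newnewid: assert _N == panelsize
```

If both checks pass silently, -xtset newnewid year- should go through as shown above.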
hth,
Jeph
On 8/5/2013 9:54 AM, Heath Henderson wrote:
Hello everybody,
In a previous exchange "bootstrapping with unbalanced panel" a method for
preserving the structure of unbalanced panel data while bootstrapping was
discussed. The idea is to cluster by the firm/household/individual
identifier variable and stratify by a variable that counts the number of
times a given entity is observed. Here is an example of that approach:
webuse union.dta
bysort idcode (year): gen byte panelsize = _N
bsample, cluster(idcode) idcluster(newid) strata(panelsize)
The approach is conceptually straightforward and after drawing the sample
we would expect newid and the time variable to uniquely identify
observations. This, however, doesn't seem to be the case as xtset yields
the following error:
. xtset newid year
repeated time values within panel
r(451);
What might be going on here? Why wouldn't newid and year uniquely identify
observations? If it matters, I am using Stata 12 SE for Mac. Any help would
be much appreciated. Thanks.
Heath Henderson
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/