Vera Troeger <[email protected]> asked,
> I want to do a Monte Carlo experiment and need to generate a
> pseudo-population that has a panel structure (tscs). how can I generate a
> random variable x_it with i cross-sections and t timeperiods?
Let's distinguish between two models,
Y_it = X1_i*b1 + X2_t*b2 + X3_it*b3 + u_i + u_t + u_it (1)
and
Y_it = X1_i*b1 + X3_it*b3 + u_i + u_it (2)
For most of the simulations I have done, (2) is good enough, so let me start
there and then move to (1).
Model 2
-------
The basic outline for creating a model-2 dataset is to create a
cross-sectional dataset (one obs. per i), fill in X1_i and u_i, then -expand-
the dataset (so that there are, say, 5*i obs.), and fill in the rest.
For instance, say we want to create a dataset of 50 panels (i=1, 2, ..., 50)
and 10 time periods (t=1, 2, ..., 10):
. drop _all
. set obs 50
. gen i = _n
. gen x1 = uniform()
. gen u_i = 2*invnorm(uniform())
. expand 10
. sort i
. by i: gen t = _n
. gen x3 = uniform()
. gen u_it = 3*invnorm(uniform())
. gen y = x1*1 + x3*2 + u_i + u_it
There are lots of variations on the above; you may want to have multiple
x1 and/or x3 variables and you may want them correlated, but in all cases,
the basic idea is the same. Make a cross-sectional dataset, fill it in,
and then add the time-series details.
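As one sketch of such a variation (not the only way to do it), you could
make x3 correlated with x1 by building it from x1 plus noise. This would
replace the -gen x3- line above; the .5 weight is an arbitrary value chosen
for illustration:
. gen x3 = .5*x1 + uniform()
. corr x1 x3
Because x1 is constant within panel after the -expand-, the x1 part of x3 is
panel-level and only the uniform() part varies over time.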
Model 1
-------
Simulating the full model is just a little more difficult than simulating
model 2.
The way to proceed is, prior to making the cross-sectional dataset, make
a time-series dataset. Then follow the outline for model (2). At
the end, -merge- in the time-series dataset you previously constructed.
Here's how to make the time-series dataset:
. drop _all
. set obs 10
. gen t = _n
Now we can generate X2_t variables and the u_t variable.
Often, you will want to make X2_t follow a process, such as
X2_t = constant + alpha*X2_t-1 + noise
or perhaps X2_t is a function of t, as well. Anyway,
. gen x2 = .
. replace x2 = 1 in 1
. replace x2 = 4 + .2*x2[_n-1] + 2*invnorm(uniform()) in 2/l
Sometimes a simple u_t is all that is necessary
. gen u_t = invnorm(uniform())
and sometimes you will want to put a process on that, too. Anyway, make the
x2 and u_t variables. Once you have the time-series dataset, sort it by t and
save it:
. sort t
. save ts, replace
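If you do want a process on u_t, one possibility is a first-order
autoregression, filled in recursively while the data are still sorted by t
and then saved again. This is only a sketch; the .5 coefficient is an
arbitrary value chosen for illustration:
. replace u_t = .5*u_t[_n-1] + invnorm(uniform()) in 2/l
. save ts, replace
-replace- works through the observations in order, so u_t[_n-1] is already
the updated value by the time observation _n is computed.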
Now make the cross-sectional dataset,
. drop _all
. set obs 50
. gen i = _n
. gen x1 = uniform()
. gen u_i = 2*invnorm(uniform())
use -expand- to convert the cross-sectional dataset into a panel, and
generate t,
. expand 10
. sort i
. by i: gen t = _n
and now, here is the new part: merge in ts.dta previously created:
. sort t
. merge t using ts
. sort i t
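A note on versions: the -merge- line above uses the older syntax. In
Stata 11 or later, the many-to-one merge would, I believe, be written as
shown below, with no need to sort either dataset by t first:
. merge m:1 t using ts
. sort i t
Either way, -merge- adds a variable named _merge that records where each
observation came from; here every observation should be matched.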
Now you can create y and do whatever else you need. For instance, perhaps
you want unbalanced panels. Then drop some of the observations.
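A sketch of those last steps, following the model-2 example for x3 and
u_it. The coefficient of 1.5 on x2 and the 20% drop rate are arbitrary
values chosen for illustration:
. gen x3 = uniform()
. gen u_it = 3*invnorm(uniform())
. gen y = x1*1 + x2*1.5 + x3*2 + u_i + u_t + u_it
. drop if uniform() < .2
The final line drops each observation independently with probability .2, so
the panels end up with differing numbers of time periods.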
-- Bill
[email protected]