Tom Boonen wants to write a utility that allows the user to specify a
set of units, time periods and variables, and produce a Stata matrix
of the resulting rows and columns of a panel-format dataset. Scott
presented a nice solution for that -- but how is Tom going to use it?
Stata doesn't perform statistical analysis on Stata matrices.

The unnecessary part here, it seems to me, is subsetting the dataset
to a particular set of variables. That could be done with preserve
and restore, but that tends to be really slow on a big panel dataset.

Here is my take on the solution:

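program drop _all
program foo2, rclass
    version 9.2
    // the varlist is commented out: this version only flags observations
    syntax /* varlist(ts min=2 numeric) */, Timex(numlist >=0 integer) ///
        Units(numlist integer) Gen(string)
    // recover the panel and time identifiers from the tsset declaration
    qui tsset
    local tvar `r(timevar)'
    local pvar `r(panelvar)'
    // turn the space-separated numlists into comma-separated lists for inlist()
    local tx: subinstr local timex " " ",", all
    local u: subinstr local units " " ",", all
    // indicator: 1 if the observation falls in the requested units and periods
    gen `gen' = (inlist(`tvar', `tx') & inlist(`pvar', `u'))
end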
use http://fmwww.bc.edu/ec-p/data/wooldridge2k/CORNWELL,clear
tsset county year
foo2, timex(83(2)87) units(7 9 11 13 19 21) gen(mysamp1)
xtdes if mysamp1
reg avgsen polpc density taxpc if mysamp1
This approach does nothing with variables, but allows you to carry
out any number of analyses on the designated subsample of units and
time periods just by appending the if condition that identifies those
observations. No need to fool around with the variables. If Tom
really wants to create a subset dataset, then reinstate the variable
list (per Scott's code) and have the routine run
preserve
keep if `gen'
keep `pvar' `tvar' `varlist'   // assumes no ts operators in the varlist
save newdatasetname
restore
after the gen `gen' statement. (The keep if `gen' comes first, since
the variable keep would otherwise drop the `gen' indicator.)
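
For what it's worth, here is a rough sketch of how that subset-saving
variant might be assembled in full. The program name foo3, the
saving() option and the names mysamp2 and subsamp are my own
placeholders, not anything from Scott's code, and I have assumed a
plain varlist without time-series operators so that keep accepts it.
It expects the data to be tsset already, as in the example above.

```stata
* Sketch only: foo3 and the saving() option are hypothetical names.
capture program drop foo3
program foo3
    version 9.2
    syntax varlist(min=1 numeric), Timex(numlist >=0 integer) ///
        Units(numlist integer) Gen(string) SAVing(string)
    * panel and time identifiers come from the existing tsset declaration
    qui tsset
    local tvar `r(timevar)'
    local pvar `r(panelvar)'
    * comma-separated lists for inlist()
    local tx: subinstr local timex " " ",", all
    local u: subinstr local units " " ",", all
    gen `gen' = (inlist(`tvar', `tx') & inlist(`pvar', `u'))
    * write the subsample to disk, then return to the full dataset
    preserve
    keep if `gen'
    keep `pvar' `tvar' `varlist'
    save `saving'
    restore
end

foo3 avgsen polpc density taxpc, timex(83(2)87) ///
    units(7 9 11 13 19 21) gen(mysamp2) saving(subsamp)
```

Because the preserve comes after the gen, the indicator variable is
still in memory once the routine finishes, so the if-condition
approach remains available alongside the saved file. The save is
deliberately written without the replace option, so it will stop
rather than overwrite an existing file.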
Kit Baum, Boston College Economics
http://ideas.repec.org/e/pba1.html
An Introduction to Modern Econometrics Using Stata:
http://www.stata-press.com/books/imeus.html