Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: managing changing variable names, types over multiple files
From: Eric Booth <[email protected]>
To: "<[email protected]>" <[email protected]>
Subject: Re: st: managing changing variable names, types over multiple files
Date: Fri, 10 Jun 2011 19:26:48 +0000
Paul:
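-descsave- from SSC would be useful for storing the variable names and other attributes, such as storage types, formats, and labels, in a dataset (see the help file).
For looping over files, take a look at -help extended_fcn-; the -dir- extended macro function returns the list of files in a directory that match a pattern.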
Here's an example of what I think you're describing:
*********!
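sysuse auto, clear
* create nine toy "yearly" files to loop over
forval n = 1/9 {
    sa testdata200`n', replace
}
clear
* start from an empty master file
sa masterlist.dta, emptyok replace
* collect the matching file names from the working directory
global files: dir "`c(pwd)'" files "testdata*.dta", nofail respectcase
foreach f of global files {
    u "`f'", clear
    descsave, sa(autodesc.dta, replace) //could save a do() here too
    * load the descriptor dataset that descsave just saved
    u autodesc.dta, clear
    g filename = "`f'"
    order filename
    * the file name in f already ends in .dta
    sa "desc_`f'", replace
    u masterlist.dta, clear
    append using "desc_`f'"
    sa masterlist.dta, replace
}
u masterlist.dta, clear
desc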
*********!
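Once the master list exists, one possible next step is to flag variables whose storage type differs across files. This is only a rough sketch: it assumes masterlist.dta holds one row per file and variable, and that the -descsave- output stores the variable name in -name- and the storage type in -type- (check with -describe- first). The variables firsttype, ntypes, and nfiles are made-up names for illustration.

```stata
u masterlist.dta, clear
* mark the first row of each distinct name/type combination
bysort name type: g byte firsttype = (_n == 1)
* number of distinct storage types recorded for each variable name
bysort name: egen ntypes = total(firsttype)
* number of files in which each variable name appears
bysort name: g nfiles = _N
* show the variables whose type is not the same in every file
list filename name type if ntypes > 1, sepby(name)
```

Comparing nfiles with the number of files looped over would similarly point to variables that are missing, or named differently, in some years.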
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
On Jun 10, 2011, at 1:54 PM, Paul Burkander wrote:
> Hi all,
>
> I'm working with data that cover several years, with a separate file
> for each year. Unfortunately, the names and types of variables
> sometimes change from year to year, making it difficult to append all
> the files. There are a large number of variables, so it's difficult
> to check for changes by hand. Also, we'll be getting more years in
> the future, so I'd like to, as much as possible, automate a system
> that catalogs variable names and types.
>
> I'm envisioning a system where we have a macro with the names of all
> the files, then loop over each file, capture all the variable names
> and types, and dump it into a master variable attributes file. I'm
> imagining a different variable for each row/attribute, so there'd be a
> 2007varname and a 2008vartype, for instance. There would also be a
> mastervarname for what we want the variable to be. Each row would
> correspond to the variable whose name may or may not change over time.
>
> Does this seem like a reasonable way to automate this? Do any of you
> have any other ideas? Are there any user-written programs that can
> aid in this process?
>
> I'd greatly appreciate any suggestions!
>
> Paul
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/