Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: repeat do-file over multiple files
From: Daniel Bela <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: repeat do-file over multiple files
Date: Sun, 4 Mar 2012 13:48:20 +0100
Dear Allan,
> The same do-file should be run on all 256 files and saved as 256 new
> files.
> And lastly the 256 new files should be combined using append to one
> file.
From my point of view, this seems quite cumbersome. In addition to the
earlier suggestions, you could achieve this while preserving the sort
order of observations in the following way:
---- begin statacode ----
clear
cd "/Users/Stata/ProjectX/"
local filelist: dir "." files "*.dta", respectcase
local num=0
local appendlist
/* work on every file and save the result to a temporary file */
foreach file of local filelist {
    use "`file'", clear
    tempfile file`++num'
    display as text in smcl "working on file number {it:`num'}..."
    /* you could also -do- an external do-file here; note that this
       do-file should not -use- or -save- anything, as that is
       already done by this loop!
    collapse <...>
    */
    display as text in smcl "... finished working on file number {it:`num'}"
    /* quote the tempfile name in case the temp path contains spaces */
    save "`file`num''"
}
/* concatenate files */
forvalues filenum=1/`num' {
    if `filenum'==1 use "`file`filenum''", clear
    else local appendlist `"`appendlist' "`file`filenum''""'
}
append using `appendlist'
---- end statacode ----
My main point is this: it would be more straightforward to concatenate
the files first and do your data preparation afterwards; for example:
---- begin statacode ----
clear
cd "/Users/Stata/ProjectX/"
local filelist: dir "." files "*.dta", respectcase
/* concatenate files */
local firstfile=`"""'+"`: word 1 of `filelist''"+`"""'
local otherfiles: list filelist - firstfile
use `firstfile'
append using `otherfiles', generate(source)
/* you now have a variable "source" identifying the group of
   observations from each file;
   work on every generated group instead of on single files;
   note that most other data preparation commands support the
   -bysort- prefix, which does the same as by() does for collapse
*/
/* you could also -do- an external do-file here; note that this has
   to perform every operation by(source)
collapse <...>, by(source)
*/
drop source
---- end statacode ----
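To illustrate the -bysort- remark above, here is a minimal sketch of
group-wise work on appended data. The auto example dataset, the
tempfile names part1 and part2, and the new variables mean_price and
rank_in_file are assumptions made purely for illustration; they are
not part of Allan's data:
---- begin statacode ----
/* illustration only: build two small files from the auto dataset
   (assumed stand-ins for the real input files) */
clear
sysuse auto
tempfile part1 part2
preserve
keep if foreign == 0
save "`part1'"
restore
keep if foreign == 1
save "`part2'"
/* concatenate, marking the origin of each observation */
use "`part1'", clear
append using "`part2'", generate(source)
/* source is 0 for the data in memory and 1 for the appended file */
/* group-wise operations: each one stays within one original file */
bysort source: egen mean_price = mean(price)
bysort source (price): generate rank_in_file = _n
---- end statacode ----
Since generate(source) numbers the data already in memory as 0 and
the appended files as 1, 2, ... in the order they are listed, every
-bysort source:- operation is carried out separately for each
original file.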
Regards
Bela
--
Daniel Bela
National Educational Panel Study (NEPS)
Data Center
postal address:
Otto-Friedrich-University Bamberg, NEPS
96045 Bamberg
GERMANY
visitor's address:
Otto-Friedrich-Universität Bamberg, NEPS
Wilhelmsplatz 3, Room 112, 96047 Bamberg
phone: +49 951 8633428
facsimile: +49 951 8633405
website: http://www.neps-data.de/