st: infile and dictionaries and the small data mindset
One of our worst fears is that someone will come to us with data
scattered all over a spreadsheet file in little summary tables. If they
have lots of those files, I can usually find a way to script the import
efficiently using ODBC.
What if you have those same tables in a text file? Is there any
efficient way to import and parse data in such a format? I have the far
end of this process scripted so the researcher can generate his own
summary statistics, but getting the data into Stata involves a program
that writes an Excel file, followed by cutting and pasting into Stata. I'd
like to cut out some of the import steps, so that all we would need to
do is give a list of filenames to a Stata script, and watch the screen
roll by as the data get extracted.
The files are generated by a program that is monitoring mouse behavior.
Each file may contain behavior from one mouse on one day, or several
mice on one day. The general format is always the same. For each
mouse-run, there is a small block of ancillary information as a header.
I cannot guarantee that all of these blocks have the same number of
words, but some of that info will be needed as data. These are followed
by blocks of numbers in columns. Each block has an alphanumeric header
before it (on its own line), and there are row numbers.
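To make the layout concrete, here is a minimal sketch of a line-by-line parse. It is written in Python purely for illustration and is not a Stata solution. It rests on assumptions the description above does not confirm: that every data row consists only of numeric tokens with the row number first, and that the text line immediately before a run of data rows is the block header. Any other text line is treated as ancillary information. The name `parse_blocks` and the long-format output are hypothetical choices.

```python
import re

# Plain decimal or scientific notation only (so words like "nan" stay text).
_NUM = re.compile(r'[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?')

def is_number(tok):
    """True if the token looks like a plain number."""
    return _NUM.fullmatch(tok) is not None

def parse_blocks(lines):
    """Split lines into ancillary text and long-format numeric records.

    Assumption: a line whose tokens are all numeric is a data row whose
    first token is the row number; the text line directly before a run
    of data rows is that block's header.
    """
    ancillary = []   # text lines that are not followed directly by data
    records = []     # tuples of (block, row, column, value)
    pending = None   # latest text line, not yet classified
    block = None     # header of the block currently being read
    for raw in lines:
        tokens = raw.split()
        if not tokens:
            continue                      # skip blank lines
        if all(is_number(t) for t in tokens):
            if pending is not None:       # the text line just seen was a block header
                block, pending = pending, None
            row = int(float(tokens[0]))
            for col, tok in enumerate(tokens[1:], start=1):
                records.append((block, row, col, float(tok)))
        else:
            if pending is not None:       # earlier text line was not a block header
                ancillary.append(pending)
            pending = raw.strip()
    if pending is not None:
        ancillary.append(pending)
    return ancillary, records
```

The long format (one record per block, row, and column) is used here because the number of columns per block is not known in advance. The ancillary lines are returned untouched, since their word count may vary from block to block.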
I would have a fairly good idea how to script this in Matlab, but I
don't want to be the one doing the import on a daily basis, and it's
hard for the researcher to justify buying into some pricey software just
to script that one task.
Any clues about scripting this type of import in Stata would be appreciated.
Thanks
Paul
--
E. Paul Wileyto, Ph.D.
Assistant Professor of Biostatistics
Tobacco Use Research Center
School of Medicine, U. of Pennsylvania
3535 Market Street, Suite 4100
Philadelphia, PA 19104-3309
215-746-7147
Fax: 215-746-7140
[email protected]
http://mail.med.upenn.edu/~epw/