Re: st: Re: Analysis of text
Dear Michael,
Thank you for the suggestions and questions.
The problem is that I don't have much more detail at the moment: the
files haven't been handed over to me yet, and the background
documentation is almost entirely missing. I posted my original question
to get a first impression of whether I need to turn to a programmer for
a specialised program or could try to resolve it on my own.
That said, I can still answer some of your questions. The file structure
was described to me as "free text, like an essay". The files are computer
generated, so the structure should be uniform. There are multiple lines,
at least one per question. The number of questions per person varies
between 300 and 500 (about 700 variables overall). Missing values are
left empty, but I don't know whether the question numbers are still
shown or whether "empty" means that those are omitted as well.
Best regards,
Taavi
Michael Blasnik wrote:
Yes, it can be done. You should really post more details if you want
more help. How are the files structured? Are there multiple lines in
each? How many variables? How consistent is the layout across
files? How are missing values coded, etc.? Why don't you just show
a section of a file?
Without more info, I'd guess that you will most likely use the -file-
command to read in the data (but -infix- may work, perhaps with
-split-). Looping across files and accumulating results is fairly easy.
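
A minimal sketch of the -file- approach, under assumptions the thread does
not confirm: the files are named *.txt and sit in the current directory,
each question marker ("M1", "M2", ...) starts a line, and the answer
follows on that same line. The handle and variable names (fh, out,
results, filename, question, answer) are hypothetical. Multi-line answers
would need extra logic to accumulate continuation lines.

```stata
* Sketch only: file layout and marker pattern are assumptions (see above).
clear all

* List all .txt files in the current directory
local files : dir "." files "*.txt"

tempname fh out
tempfile results

* One row per (file, question) pair is posted to a temporary dataset
postfile `out' str80 filename str10 question str244 answer using "`results'"

foreach f of local files {
    file open `fh' using `"`f'"', read text
    file read `fh' line
    while r(eof) == 0 {
        * A line starting with M plus digits is treated as a question marker;
        * the rest of the line is taken as the answer (may be empty)
        if regexm(`"`macval(line)'"', "^(M[0-9]+)[ ]*(.*)") {
            post `out' (`"`f'"') (regexs(1)) (regexs(2))
        }
        file read `fh' line
    }
    file close `fh'
}
postclose `out'

* Load the accumulated results: one observation per file and question
use "`results'", clear

* Optional: one row per person, one variable per question (answerM1, ...).
* Assumes each question number appears at most once per file.
reshape wide answer, i(filename) j(question) string
```

Posting in long form first and reshaping afterwards avoids having to know
in advance which of the ~700 questions appear in any given file.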
M Blasnik
----- Original Message ----- From: "Taavi Lai" <[email protected]>
To: <[email protected]>
Sent: Friday, December 08, 2006 1:56 PM
Subject: st: Analysis of text
Dear statalisters,
I have a set (~10 000) of text files, each containing questionnaire
answers for one person. These files are structured as plain text
files without any tabular form.
I'd like to loop through all these files and generate one datafile as
a result. Say, "M1" represents a question number and the value for
this question is thus the text/number between "M1" and "M2".
Could such a thing be done using Stata? Any suggestions and comments
are welcome.
Best regards,
Taavi
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/