Hello,
I would like some advice on using a data dictionary to extract comments
from a .txt file and put them into a variable.
I want to read data from a series of text files, with each file
containing unique file identifiers/descriptors in a header in lines
1-10. All files are named data.txt and I will have to do this
repeatedly, so I'm looking for a solution that is replicable across
files. The data itself is in a tabular format, starts on line 11, and
looks like this:
Spot counts:
    1   2   3   4   5   6   7   8   9  10  11  12
A   -   -   -   -   -   -   -   -   -   -   -   -
B   -   -   -   -   -   -   -   -   -   -   -   -
C   -   -   -   -   -   -   -   -   -   -   -   -
D   -   -   -   -   -   -   -   -   -   -   -   -
E   -   -   -   -   -   -   -   -   -   - 197   9
F   -   -   -   -   -   -   -   -   -   - 188   7
G   -   -   -   -   -   -   3   2   1 204 189  78
H   -   -   -   -   -   -   1   2   0 254 195  63
I have made a do file that can read the tabular data beginning in Line
11 contained in data.txt using infile and a data dictionary. My problem
is that I want to tag each observation with a unique identifier that
would be taken from line 1. This is important because I would
eventually merge all the data from the text files and I would need to
know the source.
The solution I tried seems very clumsy, and it doesn't actually work as
I wanted it to. I created a separate file with a variable (pl)
containing the unique file identifier (the string from Line 1). I then
generated an observation number using generate obsno=_n and saved the
file under a filename equal to the string from line 1. This file can
then be merged with the tabular data.
- start of do file -
set more off
infile using eli2.dct, clear
drop in 2/l
gen pl=substr(plate,18,6) /* this is data from the first line for
example a string "10u5uc"*/
gen obsno=_n
local file =pl
sort obsno
save `file', replace
infile using eli.dct, clear
gen obsno=_n
sort obsno
merge obsno using 10u5uc, keep(pl) /* How do I do this using a macro
instead of typing "10u5uc"? */
- end of do file -
The resulting merged file then contains 8 observations and a string
variable pl. As expected, variable pl is missing for records 2-8. I
would like variable pl for records 2-8 to have the same value as record
1. I can do this manually:
replace pl ="10u5uc" if _merge==1
but would rather that it were a macro. At this point, I have hit a brick
wall!!!
1. How do I extract the data in Line 1 into a variable and repeat this
for all the records?
2. Is it possible to merge files and refer to the using dataset using a
macro?
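To make the two questions concrete, here is a rough sketch of the kind of
thing I imagine, not a tested solution. It assumes that the identifier
sits in columns 18-23 of line 1 of data.txt (as in the substr(plate,18,6)
call in the do file above) and that eli.dct reads the table from line 11.
The local macro names fh, line and id are made up for illustration.

```stata
* Sketch only: read line 1 of data.txt into a local macro
tempname fh
file open `fh' using data.txt, read text
file read `fh' line                /* first line is stored in local macro line */
file close `fh'

* Assumption: the identifier occupies columns 18-23 of that line
local id = substr(`"`line'"', 18, 6)

* Read the tabular data and tag every observation with the identifier
infile using eli.dct, clear
generate str6 pl = "`id'"

* The same macro can serve as the file name
save "`id'", replace
```

If something like this works, the merge step would not be needed at all.
Failing that, I suppose the local macro `file' defined in my do file is
still available after the second infile, so it could perhaps be written
in place of the literal name in the merge and replace commands.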
I look forward to your suggestions!
Juan