Dear Stata gurus:
I have data that comes to me in a text file that is generally columnar, but
does not contain any delimiters. If I import the file into Stata it comes
in as a single variable, v1; the first 10 observations look something like
this (assuming the email doesn't mangle it here):
. list v1
+----------------------------------------------------------------------
| v1
|----------------------------------------------------------------------
1. | ANY 1206 Bunk1 Corn Silage $0.00 97 FarmGrown Forag lb
2. | 1234 Shed1 Grass Hay $0.00 146 Purchased Forag lb
3. | 1582 Purch1 Straw $0.00 164 Purchased Forag lb
4. | 1237 Shor1 Haylage 1st cut $0.00 149 FarmGrown Forag lb
5. | 1238 Bunk4 Haylage 2nd cut $0.00 150 FarmGrown Forag lb
|----------------------------------------------------------------------
6. | 1070 CornGrainGrndFine $135.00 205 Purchased Energ lb
7. | 039 BakeryByProdBread $130.00 308 Purchased Energ lb
8. | 1052 BeetPulpPlCp $140.00 313 Purchased Energ lb
9. | 1022 EnergyBooster $1,200. 521 Purchased Energ lb
10. | 00
Note that in some columns there is text for only some observations; a
section which is "names" has variable numbers of "words" in each name, etc.
I have not been able to -infile- this file using a dictionary file,
because there are embedded blanks wherever the spaces are. I also tried to
slice out substrings or words using string functions, but the embedded
spaces cause different "columns" to be picked up in different observations.
I have created a convoluted routine that first strips out the embedded
blanks using the -itrim- function (which is missing from the [D] manual
section on string functions). Then I work in from either end using a series
of operations such as:
g v4 = word(v3, -1)
replace v1 = subinstr(v3, word(v3, -1), "", 1)
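To make the routine concrete, here is a minimal sketch of peeling two fields off the right-hand end. It assumes the raw line is in v1 as imported; the names v3, unit and group are only illustrative. Because subinstr() with a count of 1 removes the first match in the string rather than the last, this version trims the last word off by length instead.

```stata
* Sketch only: v3, unit and group are hypothetical names
* Collapse each run of blanks to a single blank and trim both ends
generate v3 = itrim(trim(v1))

* Take the last word (e.g. "lb"), then cut it off the end by length,
* so that an identical word earlier in the line is left alone
generate unit = word(v3, -1)
replace v3 = trim(substr(v3, 1, length(v3) - length(unit)))

* Repeat for the next field in from the right (e.g. "Forag")
generate group = word(v3, -1)
replace v3 = trim(substr(v3, 1, length(v3) - length(group)))
```

The same pattern repeats for each remaining field, which is what makes the routine so long.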
Is there a way to get these columns of data into variables in a .dta file
more easily, or is there a way to convert the embedded blanks to characters
or " " spaces so I could operate on them more easily?
I could do it by putting it through my text editor, but I ultimately need
the import to be automated in a .do file, so I need a procedure that avoids
having to manually deal with it.
Thanks for any insights.
Buzz
Buzz Burhans, Ph.D.
Dairy-Tech Group
Phone: 802-755-6842
Cell: 802-388-7214
Email: [email protected]