st: Re: reading a txt file that loops
From: Mike Lacy <[email protected]>
To: [email protected]
Date: Sun, 17 Apr 2011 09:45:32 -0600
[email protected] wrote
>Are there any shortcuts to reading a data file that has the following format
>other than to reorganize the data before importing?
Here's a simple approach with ordinary Stata machinery. The general
idea is to read each line of the data as a string and then lightly
massage it into something that can be -insheet-ed in the form
"FIPS, y1, y2, y3, y4, locname, decstart"
I presume the data have the following form and are stored in "loopdata.csv":
FIPS 1990 1980 1970 1960
00000 248709873 226545805 203211926 179323175 United States
18000 5544159 5490224 5193669 4662498 Indiana
18001 31095 29619 26871 24643 Adams County
18003 300836 294335 280455 232196 Allen County
18005 63657 65088 57022 48198 Bartholomew County
FIPS 1950 1940 1930 1920
00000 151325798 132164569 12320262 106021537 United States
18000 3934224 3427796 3238503 2930390 Indiana
18001 22393 21254 19957 20503 Adams County
18003 183722 155084 146743 114303 Allen County
18005 36108 28276 24864 23887 Bartholomew County
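The script that follows depends on one idiom worth seeing in isolation: tag each header line with its first year, then copy that tag down to the data lines beneath it. Here is a minimal sketch on four lines taken from the sample above, typed in with -input- rather than read from the file (str60 is just a width I picked to hold the longest line):

```stata
clear
input str60 v1
"FIPS 1990 1980 1970 1960"
"18001 31095 29619 26871 24643 Adams County"
"FIPS 1950 1940 1930 1920"
"18001 22393 21254 19957 20503 Adams County"
end
// Header lines get their first year; data lines get "" (string missing)
gen decstart = word(v1, 2) if strpos(v1, "FIPS") > 0
// replace works down the rows in order, so each filled value feeds the next row
replace decstart = decstart[_n-1] if missing(decstart)
list, noobs
```

Because -replace- processes observations sequentially, a single pass is enough to fill a block of any length.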
//
//clear
insheet using loopdata.csv // each line of the data becomes a string in v1
replace v1 = itrim(v1) //multiple spaces are a nuisance
// Mark cases according to starting decade
gen decstart = (word(v1,2)) if (strpos(v1, "FIPS") > 0)
replace decstart = decstart[_n-1] if missing(decstart)
// We want something comma delimited, with only one header line with
// variable names.
replace v1 = subinstr(v1, " ", ",", 5) // 5 is unique to this data, of course
drop if ((strpos(v1, "FIPS") > 0) & (_n > 1))
replace v1 = "FIPS, y1, y2, y3, y4, locname, decstart" if _n ==1
replace v1 = v1 + ", " + decstart if _n > 1
drop decstart
// Save data file as a csv, then reimport
tempfile temp
outsheet using `temp', nonames noquote
clear
insheet using `temp', names comma
// A long data structure makes sense, I'm guessing
reshape long y, i(fips decstart) j(year)
replace year = decstart - (year-1) * 10
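To see what the last two lines do without needing the file, here is a small self-contained sketch. It assumes the reimported data look like the two Adams County rows from the sample (typed in by hand here), with y1-y4 holding the four counts in the order they appear on each line:

```stata
clear
input fips decstart y1 y2 y3 y4
18001 1990 31095 29619 26871 24643
18001 1950 22393 21254 19957 20503
end
// reshape numbers the stubs 1..4, so year starts out as 1, 2, 3, 4
reshape long y, i(fips decstart) j(year)
// 1 -> decstart, 2 -> decstart - 10, 3 -> decstart - 20, 4 -> decstart - 30
replace year = decstart - (year - 1) * 10
sort fips year
list, noobs
// Checks: every year is a census decade, and fips-year identifies a row
assert inrange(year, 1920, 1990) & mod(year, 10) == 0
isid fips year
```

The -assert- and -isid- lines are optional checks I would add; they stop with an error if a block had been given the wrong starting decade.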
Regards,
=-=-=-=-=-=-=-=-=-=-=-=-=
Mike Lacy, Assoc. Prof.
Soc. Dept., Colo. State. Univ.
Fort Collins CO 80523 USA
(970)-491-6721