These identifiers should be read in and maintained as string variables.
Then you can extract the first three digits using e.g.
-substr(str_zipcode, 1, 3)-. Here -str_zipcode- is a string variable
version of a numeric -zipcode-.
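
As a minimal sketch of that extraction, assuming a toy dataset of three made-up zipcodes and a hypothetical new variable name -zip3-:

```stata
* Toy data: zipcodes held as 7-character strings (values are invented)
clear
input str7 str_zipcode
"0037845"
"1234567"
"0100200"
end

* First three characters of each zipcode; leading zeros are kept
* because the source variable is a string
gen zip3 = substr(str_zipcode, 1, 3)
list
```

Because -zip3- is itself a string, a prefix such as 003 stays distinct from 3.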
Retrospective surgery to get a string variable with leading zeros from a
numeric variable with integer values is
gen str_zipcode = substr("0000000", 1, 7 - length(string(zipcode))) + string(zipcode)
or (more elegantly)
gen str_zipcode = string(zipcode, "%07.0f")
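
A small worked example may help here. It is only a sketch: the three values are invented, and -manual- and -zip3- are hypothetical variable names. Note that -length()- takes a string argument, so the numeric variable is wrapped in -string()- first.

```stata
* Toy data: numeric zipcodes that have already lost their leading zeros
clear
input long zipcode
37845
1234567
100200
end

* Pad to 7 digits with a display format
gen str7 str_zipcode = string(zipcode, "%07.0f")

* The same result by prefixing the missing zeros by hand
gen str7 manual = substr("0000000", 1, 7 - length(string(zipcode))) + string(zipcode)

* The two approaches should agree on every observation
assert str_zipcode == manual

* The first three digits, now including the restored zeros
gen zip3 = substr(str_zipcode, 1, 3)
list
```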
However, note that unless your variable was read in as a long or double
you may have lost accuracy in final digits, so re-doing the input and
insisting on string variables is likely to be much the safest way.
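
One way to redo the input, sketched on the assumption that you have Stata 13 or later (where -import delimited- is available) and can read the csv file directly. The example writes its own tiny csv to a temporary file so that it is self-contained; in practice you would point -import delimited- at your own file.

```stata
* Write a two-line csv to a temporary file (stand-in for the real data)
tempfile csv
tempname fh
file open `fh' using "`csv'", write text replace
file write `fh' "zipcode" _n "0037845" _n "1234567" _n
file close `fh'

* Read every column as a string so that leading zeros survive
import delimited using "`csv'", stringcols(_all) varnames(1) clear

* zipcode is now a string variable, so substr() applies directly
gen zip3 = substr(zipcode, 1, 3)
list
```

With many columns you would probably name only the identifier columns in -stringcols()- rather than use -_all-.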
Nick
Ekaterina Hertog wrote:
1) I have a variable consisting of 7-digit zipcodes. I want to
create a second variable containing only the first 3 digits
of each zipcode, and I cannot find a way to do it.
2) Some of the zipcodes start with 00, e.g. 0037845, and Stata drops the
leading 00, turning such zipcodes into 5-digit numbers (e.g. 37845). I need
to make Stata understand that these zeros are meaningful and restore them
to the zipcodes, and I cannot find out how to do this. They were
present in my original csv file, but when I converted it into a Stata
file using Stat/Transfer they were gone.