Wade T Roberts posed a problem to which various
people offered solutions, and indeed, there's more
than one way to do it. I add some extra comments
on each solution.
> I have a single numeric variable that identifies the
> city/county/state for
> each case, where the first three digits represent the
> city, the next two
> the county, and the final two represent the state.
>
> Examples:
>
> 223406
> 1453209
> 2785845
> etc...
>
> I'm only interested in identifying cases by state at the
> moment. How do I
> go about singling out this part of the data, or creating
> a new identifying state variable?
Marcela Perticara
> I don't know if there is an specific command for this, but
> one way would be
> to
>
> gen statecode=full_id-int(full_id/100)*100
>
> where full_id is the original variable. I guess it should
> work as long as
> your state code is always in the last two digit of
> your original variable.
In a similar vein, the last two digits are given by
mod(full_id,100)
Note that the last two digits 06 will map to the number 6.
If you wanted the explicit leading zero, you would need
something like
string(mod(full_id,100), "%02.0f")
where the result is a string and the numeric
format %02.0f ensures a leading zero whenever
the result would otherwise be a single digit.
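As a quick check, here is a minimal sketch using the three example
IDs from the original question. The variable names statenum and
statestr are just illustrative, and I assume the IDs are held in a
long variable called full_id, as in Marcela's post.

```stata
clear
input long full_id
223406
1453209
2785845
end

* numeric state code: 6, 9, 45
gen byte statenum = mod(full_id, 100)

* same result as the int() arithmetic suggested above
assert statenum == full_id - int(full_id/100)*100

* string state code with leading zero: "06", "09", "45"
gen str2 statestr = string(mod(full_id, 100), "%02.0f")

list
```

The numeric version is convenient for -if- conditions and value
labels; the string version is the one to use if the leading zero
matters for display or merging.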
Fernando Lozano
> gen newvar=string(oldvar)
> gen state=substr(newvar,n1,n2)
> where n1 is the first digit of the variable to appear on state and n2 is
> the last digit. For example:
> if newvar(i)=1453209
> then gen state(i)=substr(newvar,5,7) will generate state(i)=09
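The main idea is fine, but a few details are wrong here.
In substr(s,n1,n2), n1 is the position of the first character
wanted, but n2 is not the position of the last: it is the
(maximum) length of the substring. So substr("1453209",5,7)
yields "209", and it is substr("1453209",6,2) that yields "09".
Subscripts are given within [], not (), and cannot be supplied
on the left of the = sign.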
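A corrected version of this approach might look like the sketch
below. It keeps Fernando's names oldvar and newvar; padding to seven
digits with the %07.0f format is my assumption, made so that the
state code always starts at position 6 even for six-digit IDs.

```stata
* substr(s, n1, n2): n1 = starting position, n2 = maximum length
* displays 209
display substr("1453209", 5, 7)
* displays 09
display substr("1453209", 6, 2)

clear
input long oldvar
223406
1453209
2785845
end

* pad to 7 digits so that positions line up across cases
gen newvar = string(oldvar, "%07.0f")
gen state = substr(newvar, 6, 2)

* a subscript goes in [] and only within an expression
display newvar[2] + " -> " + state[2]
list
```

No subscript is needed on -generate- itself, which works on all
observations at once.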
Daniel R. Sabath
> Since you really are not using the state variable as a numeric, convert it
> to a string.
> tostring geocode, generate(str_geocode)
> Then use the string processing functions to get what you want. In this case
> gen state = substr(str_geocode,-2,2) /* -2 is 2 from the right side for 2 characters */
> gen county = substr(str_geocode,-4,2)
> The only problem you have is where the city code is less than 100.
> This pads the string out to 7 characters if it only has 6.
> replace str_geocode = "0" + str_geocode if length(str_geocode) == 6
> Then
> gen city = substr(str_geocode,1,3)
> More information can be had by typing "help substr" which will bring up help
> on all the string functions.
An alternative is just to use -string()-, as Fernando suggested.
Daniel's idea can then be re-expressed this way:
gen state = substr(string(geocode),-2,2)
gen county = substr(string(geocode),-4,2)
gen city = substr(string(geocode, "%07.0f"),1,3)
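To see these three commands at work, here is a small sketch on the
example IDs from the original question. The numeric cross-checks
with mod() and floor() are an addition of mine, not part of anyone's
posted solution; they assume geocode holds whole numbers of at most
seven digits.

```stata
clear
input long geocode
223406
1453209
2785845
end

gen state  = substr(string(geocode), -2, 2)
gen county = substr(string(geocode), -4, 2)
gen city   = substr(string(geocode, "%07.0f"), 1, 3)

* cross-check the string pieces against plain arithmetic
assert real(state)  == mod(geocode, 100)
assert real(county) == mod(floor(geocode/100), 100)
assert real(city)   == floor(geocode/10000)

list
```

For 223406 this gives city "022", county "34" and state "06".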
Nick