Hi all,
Thanks so much for your suggestions. Unfortunately, none of the suggested
solutions I tried worked for me.
I think the culprit is that the id is a numeric variable with 18 digits,
stored as a double, as suggested by Profs Buis and Cox.
Here's the cut and paste of my experimentation:
e.g. the first id = 610102001001010300
The following experimentation did what I was after, but only up to
extracting the first nine digits (i.e. the township variable). From the
9th digit onwards (creating the "censusarea"), I couldn't get what I
wanted.
Any further insights on how to deal with this in Stata would be much
appreciated, since I am trying to avoid a brute-force approach of splitting
the variable in Excel/Notepad and bringing it back into Stata.
Thanks,
Susan
. gen double city=(floor(id/100000000000000))
. gen double county=(floor(id/1000000000000))
. gen double township=(floor(id/1000000000))
. tab city
city | Freq. Percent Cum.
------------+-----------------------------------
6101 | 68,612 20.18 20.18
6102 | 7,683 2.26 22.43
6103 | 34,610 10.18 32.61
6104 | 47,189 13.88 46.49
6105 | 50,540 14.86 61.35
6106 | 19,706 5.79 67.14
6107 | 32,904 9.68 76.82
6108 | 30,320 8.92 85.73
6109 | 25,838 7.60 93.33
6125 | 22,680 6.67 100.00
------------+-----------------------------------
Total | 340,082 100.00
. tab county
county | Freq. Percent Cum.
------------+-----------------------------------
610102 | 4,699 1.38 1.38
610103 | 6,186 1.82 3.20
610104 | 5,842 1.72 4.92
610111 | 4,708 1.38 6.30
610112 | 4,273 1.26 7.56
610113 | 7,027 2.07 9.63
610114 | 2,283 0.67 10.30
610115 | 6,204 1.82 12.12
610121 | 8,429 2.48 14.60
610122 | 5,600 1.65 16.25
610124 | 5,798 1.70 17.95
610125 | 5,345 1.57 19.52
. tab township in 1/20
township | Freq. Percent Cum.
------------+-----------------------------------
6.10e+08 | 20 100.00 100.00
------------+-----------------------------------
Total | 20 100.00
. tostring township, replace
township was double now str9
. tab township in 1/20
township | Freq. Percent Cum.
------------+-----------------------------------
610102001 | 20 100.00 100.00
------------+-----------------------------------
Total | 20 100.00
. gen double censusarea=(floor(id/1000000))
. des censusarea
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
censusarea      double %10.0g
. order censusarea
. tostring censusarea, replace
censusarea cannot be converted reversibly; no replace
. list censusarea in 1/20
+-----------+
| censusa~a |
|-----------|
1. | 6.101e+11 |
2. | 6.101e+11 |
3. | 6.101e+11 |
4. | 6.101e+11 |
5. | 6.101e+11 |
|-----------|
6. | 6.101e+11 |
7. | 6.101e+11 |
8. | 6.101e+11 |
9. | 6.101e+11 |
10. | 6.101e+11 |
|-----------|
11. | 6.101e+11 |
12. | 6.101e+11 |
13. | 6.101e+11 |
14. | 6.101e+11 |
15. | 6.101e+11 |
|-----------|
16. | 6.101e+11 |
17. | 6.101e+11 |
18. | 6.101e+11 |
19. | 6.101e+11 |
20. | 6.101e+11 |
+-----------+
.
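A note on the log above: censusarea holds only 12 digits, which a double
can store exactly, so the "cannot be converted reversibly" message is
probably not a precision problem here. A likely cause is that tostring
converts with the format %12.0g unless told otherwise, and the 6.101e+11 in
the listing comes from the %10.0g display format. The sketch below assumes
censusarea was created as in the log; the names censusarea_str and
censusarea_s2 are made up for illustration.

```stata
* Sketch only: assumes censusarea is a double holding the first 12 digits of id.
* Convert with an explicit fixed format so all 12 digits are written out.
tostring censusarea, generate(censusarea_str) format(%12.0f)

* Equivalent route using the string() function with the same format.
generate str12 censusarea_s2 = string(censusarea, "%12.0f")

* If the variable should stay numeric, changing the display format
* is enough to see all digits in list and tab output.
format censusarea %12.0f
list censusarea censusarea_str censusarea_s2 in 1/5
```

Using generate() rather than replace keeps the numeric original, so the
result can be checked before anything is overwritten.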
Date: Sun, 22 Jul 2007 15:30:39 -0700
From: "Susan Olivia" <[email protected]>
Subject: st: Splitting numeric values
Hi,
I have a numeric variable (call it id) that comes with 18 digits and I would
like to create a new variable that extracts 4 digits from 'id', starting
from the 10th digit.
E.g. my id is given as: 610102001001010300 and I want to create 'newvar'
which has value of 0010.
I know this can be easily done using the 'substr' function; however, I am
having a problem converting the 'id' into a string variable. It gives me
the following message:
***
tostring id, replace
id cannot be converted reversibly; no replace
***
Any advice on how to handle this would be much appreciated.
Thanks,
Susan
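For context on the message above: a double stores integers exactly only up
to 2^53 (roughly 9.0e15), and an 18-digit id is larger than that, so its
last few digits may already be rounded once it sits in a numeric variable.
That is one reason tostring may refuse a reversible conversion. A quick way
to see the effect, using only the example id quoted in this post:

```stata
* Largest power of two up to which every integer fits exactly in a double.
display %20.0f 2^53

* The example id typed as a literal and shown with all its digits.
* The trailing digits may differ from what was typed, because the
* value is held as the nearest representable double.
display %20.0f 610102001001010300
```

If the trailing digits matter, the id would have to be read in as a string
from the raw data rather than repaired after the fact.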
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
------------------------------
Date: Sun, 22 Jul 2007 18:51:03 -0400
From: Michael Hanson <[email protected]>
Subject: Re: st: Splitting numeric values
On Jul 22, 2007, at 6:30 PM, Susan Olivia wrote:
> I have a numeric variable (call it id) that comes with 18 digits
> and I would
> like to create a new variable that extracts from the variable 'id'
> starting
> from the 10th digits and get 4 digits from here.
>
> E.g. my id is given as: 610102001001010300 and I want to create
> 'newvar'
> which has value of 0010.
[snip]
> Any advice on how to handle this would be much appreciated.
Wouldn't this work? (Untested)
. gen str4 newvar = substr(string(id),10,4)
Hope this helps.
-- Mike
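A caveat on the untested suggestion above: string(id) without a format is
likely to return scientific notation for a number this large, so the
substr() call would not find the digits it needs. A minimal sketch of two
variations follows, assuming id is an 18-digit identifier stored as a
double; id_str and newvar2 are illustrative names, not from the thread.

```stata
* Sketch only: assumes id is an 18-digit identifier stored as a double.
* An explicit fixed format makes string() write out all 18 digits.
generate str18 id_str = string(id, "%18.0f")
generate str4 newvar = substr(id_str, 10, 4)

* Arithmetic alternative: drop the last 5 digits, keep the next 4,
* then restore leading zeros with a zero-padded format.
generate str4 newvar2 = string(mod(floor(id/100000), 10000), "%04.0f")

list id_str newvar newvar2 in 1/5
```

Because a double cannot hold 18 digits exactly, the last few characters of
id_str may not match the original id. Digits 10 to 13 sit well away from
that end in the example given, but ids whose final five digits are close to
00000 or 99999 would be worth checking by hand.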