The step-by-step approach is usually best,
especially if it turns out that you want
something slightly different. But note
that Ian's code can be condensed to
rtrim(substr(v1,1,3)) + subinstr(substr(v1,4,6)," ","0",.)
not that I would necessarily recommend that. Given
a trade-off between clarity and brevity, go
for clarity.
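As a sketch only, the condensed expression could be used in a single
generate statement like the one below. The variable name v2 and the
str6 type are taken from elsewhere in this thread; the third argument
of substr() is a length, so 3 is used here where Ian's code has 6
(both return columns 4-6 of a six-character string).

    . gen str6 v2 = rtrim(substr(v1,1,3)) + subinstr(substr(v1,4,3)," ","0",.)

For "CCH  7" this gives "CCH" + "007", that is, "CCH007".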
Nick
Ian Watson wrote:
Assuming your data is in a plain text file test.txt, try the
following:
. insheet using test.txt
(1 var, 7 obs)
. list
     +--------+
     |     v1 |
     |--------|
  1. | B    1 |
  2. | B  120 |
  3. | CCH  7 |
  4. | CCH 23 |
  5. | CCH213 |
     |--------|
  6. | UW  23 |
  7. | UW 232 |
     +--------+
. gen alpha=substr(v1,1,3)
. list alpha
     +-------+
     | alpha |
     |-------|
  1. |   B   |
  2. |   B   |
  3. |   CCH |
  4. |   CCH |
  5. |   CCH |
     |-------|
  6. |   UW  |
  7. |   UW  |
     +-------+
. replace alpha=rtrim(alpha)
(4 real changes made)
. list alpha
     +-------+
     | alpha |
     |-------|
  1. |     B |
  2. |     B |
  3. |   CCH |
  4. |   CCH |
  5. |   CCH |
     |-------|
  6. |    UW |
  7. |    UW |
     +-------+
. gen num=substr(v1,4,6)
. list num
     +-----+
     | num |
     |-----|
  1. |   1 |
  2. | 120 |
  3. |   7 |
  4. |  23 |
  5. | 213 |
     |-----|
  6. |  23 |
  7. | 232 |
     +-----+
. replace num=subinstr(num," ","0",.)
(4 real changes made)
. list num
     +-----+
     | num |
     |-----|
  1. | 001 |
  2. | 120 |
  3. | 007 |
  4. | 023 |
  5. | 213 |
     |-----|
  6. | 023 |
  7. | 232 |
     +-----+
. gen v2=alpha+num
. list v2
     +--------+
     |     v2 |
     |--------|
  1. |   B001 |
  2. |   B120 |
  3. | CCH007 |
  4. | CCH023 |
  5. | CCH213 |
     |--------|
  6. |  UW023 |
  7. |  UW232 |
     +--------+
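The steps above can also be collected into a short do-file. This is a
minimal sketch, assuming the same file test.txt and that insheet reads
each line into a single string variable v1, as in the session above.
The clear option, the length of 3 in the second substr() call, and the
final drop are additions for convenience, not part of the session.

    * read the file; each line arrives as one string variable, v1
    insheet using test.txt, clear

    * columns 1-3: letters, with trailing blanks removed
    gen alpha = rtrim(substr(v1,1,3))

    * columns 4-6: digits, with blanks replaced by zeros
    gen num = subinstr(substr(v1,4,3)," ","0",.)

    * join the two parts, then drop the intermediate variables
    gen v2 = alpha + num
    drop alpha num
    list v1 v2

Keeping alpha and num as separate variables until the end makes it easy
to list each one and check it before joining them.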
Clare L Maxwell wrote:
> I have a six-column subject identifier coming into Stata from an ASCII
> file that looks like the following:
>
> B    1
> B  120
> CCH  7
> CCH 23
> CCH213
> UW  23
> UW 232
>
> I want to read in and manipulate these 6 columns so that in the end, I
> have a str6 variable that looks like this:
>
> B001
> B120
> CCH007
> CCH023
> CCH213
> UW023
> UW232
>
> That is, the first three columns have been right justified and the last
> three columns have been left-padded with zeros. I have tried various
> options, but so far, not much luck. I was considering reading them in
> as six str1 variables and putting things together sort of by brute
> force, but I thought I'd ask first. Any suggestions on good string
> manipulations for this problem?