The step-by-step approach is usually best,
especially if it turns out that you want
something slightly different. But note
that Ian's code can be condensed to
rtrim(substr(v1,1,3)) + subinstr(substr(v1,4,6)," ","0",.)
not that I would necessarily recommend that. Given
a trade-off between clarity and brevity, go
for clarity.
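
As a minimal sketch, the condensed expression can be tried on a copy of the example data. This assumes the identifiers occupy six columns, with the letters in columns 1-3 and the digits right-justified in columns 4-6; the input block only recreates the seven observations, and v2 is the same name Ian uses.

```stata
clear
* recreate the example data (spacing assumed: 3 letter columns + 3 digit columns)
input str6 v1
"B    1"
"B  120"
"CCH  7"
"CCH 23"
"CCH213"
"UW  23"
"UW 232"
end

* one-line version: drop trailing blanks from the letters,
* then replace every blank in the digit columns with "0"
* (substr(v1,4,6) asks for up to 6 characters; only 3 remain)
gen v2 = rtrim(substr(v1,1,3)) + subinstr(substr(v1,4,6)," ","0",.)
list v1 v2
```

The result should match the v2 that the step-by-step version produces.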
Nick
Ian Watson
> Assuming your data is in a plain text file test.txt, try the
> following:
>
> . insheet using test.txt
> (1 var, 7 obs)
>
> . list
>
>      +--------+
>      |     v1 |
>      |--------|
>   1. | B    1 |
>   2. | B  120 |
>   3. | CCH  7 |
>   4. | CCH 23 |
>   5. | CCH213 |
>      |--------|
>   6. | UW  23 |
>   7. | UW 232 |
>      +--------+
>
> . gen alpha=substr(v1,1,3)
>
> . list alpha
>
>      +-------+
>      | alpha |
>      |-------|
>   1. |   B   |
>   2. |   B   |
>   3. |   CCH |
>   4. |   CCH |
>   5. |   CCH |
>      |-------|
>   6. |   UW  |
>   7. |   UW  |
>      +-------+
>
> . replace alpha=rtrim(alpha)
> (4 real changes made)
>
> . list alpha
>
>      +-------+
>      | alpha |
>      |-------|
>   1. |     B |
>   2. |     B |
>   3. |   CCH |
>   4. |   CCH |
>   5. |   CCH |
>      |-------|
>   6. |    UW |
>   7. |    UW |
>      +-------+
>
>
> . gen num=substr(v1,4,6)
>
> . list num
>
>      +-----+
>      | num |
>      |-----|
>   1. |   1 |
>   2. | 120 |
>   3. |   7 |
>   4. |  23 |
>   5. | 213 |
>      |-----|
>   6. |  23 |
>   7. | 232 |
>      +-----+
>
> . replace num=subinstr(num," ","0",.)
> (4 real changes made)
>
> . list num
>
>      +-----+
>      | num |
>      |-----|
>   1. | 001 |
>   2. | 120 |
>   3. | 007 |
>   4. | 023 |
>   5. | 213 |
>      |-----|
>   6. | 023 |
>   7. | 232 |
>      +-----+
>
> . gen v2=alpha+num
>
> . list v2
>
>      +--------+
>      |     v2 |
>      |--------|
>   1. |   B001 |
>   2. |   B120 |
>   3. | CCH007 |
>   4. | CCH023 |
>   5. | CCH213 |
>      |--------|
>   6. |  UW023 |
>   7. |  UW232 |
>      +--------+
Clare L Maxwell
> > I have a six-column subject identifier coming into Stata from an ASCII
> > file that looks like the following:
> >
> > B    1
> > B  120
> > CCH  7
> > CCH 23
> > CCH213
> > UW  23
> > UW 232
> >
> > I want to read in and manipulate these 6 columns so that in the end, I
> > have a str6 variable that looks like this:
> >
> > B001
> > B120
> > CCH007
> > CCH023
> > CCH213
> > UW023
> > UW232
> >
> > That is, the first three columns have been right justified and the last
> > three columns have been left-padded with zeros. I have tried various
> > options, but so far, not much luck. I was considering reading them in
> > as six str1 variables and putting things together sort of by brute
> > force, but I thought I'd ask first. Any suggestions on good string
> > manipulations for this problem?