
st: RE: RE: dealing with multiple alphanumeric responses

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Date	Tue, 2 Jul 2002 08:51:26 +0100

Pooja Gupta wrote
> 
> > one of my
> > variables has multiple alphanumeric characters that are not 
> > seperated by commas.
> > for eg, the first five observations of the variable are
> > 
> > 1. ABC
> > 2. ABCEG
> > 3. BDEGHI
> > 4. ACDFGI
> > 5. AHI
> > 
> > can a write a code which allows me to do a tabulation of each 
> > of these alphabets
> > (i.e., how many As, how many B, how many C and so on) ?

and Tom Steichen suggested 
> 
> Something of the form
> 
> . for any A B C D E F G H I: gen v_X=index(var, "X") \ replace 
> v_X=1 if v_X>1 
> 
> where A B C D E F G H I is the list of possible alpha characters
>   and var is the variable of interest
> 
> will generate individual numeric (0,1) variables for each alpha code
> that can then be tabulated with the usual tabulation commands.
> 
> Tom
> 

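Tom's code in fact works as written: index() returns 1 for a match 
in the first column, so that value is already what is wanted and 
only values above 1 need replacing. The condition -v_X>0- gives 
the same result and states the intent more directly: 

. for any A B C D E F G H I: gen v_X=index(var, "X") \ replace v_X=1 if v_X>0 

His code can also be telescoped: 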

. for any A B C D E F G H I: gen v_X=index(var, "X") > 0  

That still leaves several variables, which as said can be 
tabulated one by one, but you might want something more 
compact. 
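
As a rough sketch of the indicator approach in more recent syntax: 
-foreach- can stand in for -for-, and strpos() is the newer name for 
index(). The variable name -v- and the names v_A to v_I are 
assumptions for illustration, and -tabstat- is just one way to put 
all the counts in a single row. 

```stata
* Sketch only: assumes a string variable v and that v_A ... v_I do not yet exist
foreach X in A B C D E F G H I {
    * 1 if the letter occurs anywhere in v, 0 otherwise
    gen byte v_`X' = strpos(v, "`X'") > 0
}

* The sum of each 0/1 indicator is the number of observations containing that letter
tabstat v_A-v_I, statistics(sum)
```

Note that this counts observations containing each letter, which 
matches a count of characters only if no letter is repeated within 
a string. 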

Here's another way to approach it. I assume string variable 
-v-. 

1. -save- the data set if not already saved. 

2. -trim()- any leading or trailing spaces: 

replace v = trim(v) 

3. calculate the length of each string: 

gen l = length(v) 

4. record obs number 

gen long obs = _n 

5. -expand- using -l- 

expand l 

6. -sort- by observation and take the _n-th character from each copy 

bysort obs: gen str1 char = substr(v,_n,1) 

7. -tabulate- results 

tab char 

8. -save- this data set if needed in future 

9. return to original data set 
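
The steps above can be collected into a short do-file sketch. The 
five strings are the ones from the original question. Using 
-preserve- and -restore- in place of saving and reloading is a 
substitution for steps 1 and 9, not part of the recipe as listed, 
and it assumes the expanded data are not needed afterwards. 

```stata
* Example data from the original question
clear
input str6 v
ABC
ABCEG
BDEGHI
ACDFGI
AHI
end

preserve                      // stands in for steps 1 and 9 (assumption)

replace v = trim(v)           // step 2: strip leading and trailing spaces
gen l = length(v)             // step 3: number of characters in each string
gen long obs = _n             // step 4: identifier for each original observation
expand l                      // step 5: one copy per character
bysort obs: gen str1 char = substr(v, _n, 1)   // step 6: _n-th character of each copy

* Step 7: for these five strings the frequencies are
* A 4, B 3, C 3, D 2, E 2, F 1, G 3, H 2, I 3 (23 characters in total)
tab char

restore                       // back to the five original observations
```

If the expanded data set is wanted later, a -save- under a new name 
just before -restore- would cover step 8. 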

Nick 
[email protected] 
