Sarah said:
>
> I'm recoding a substantial number of text responses into categorical
> variables. I'm finding it easier to -encode- the variables with the
> text responses first, before replacing the categorical variables with
> the correct value - this way I can avoid typing out all the text
> responses in the -replace- command and just type their encoded numbers.
> I have done this for 6 variables, and it worked fine for 5 of them. I
> cannot figure out what went wrong with the 6th.
>
> The variable I am trying to encode has about 90 categories. When I
> encode though, the resulting variable I generate begins at number 8 and
> ends at 238. The first category (text response) gets an 8, the second
> gets a 12, and so forth. The manual states that -encode- alphabetizes
> before it encodes, but that doesn't explain my problem. I would still
> expect the numbers to go sequentially, which they have with the other 5
> variables.
I cannot say definitely what's happening without looking at your data,
but there are two things to check in the original text values:
1) Trim them, e.g. -replace var6=trim(var6)-
This gets rid of leading and trailing spaces, which can throw things off.
2) Check for (what amount to) missing values in the text field, i.e.
"", " ", "  ", etc.
I know that when -encode- has gone non-alpha for me, it's because of spaces.
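
A minimal sketch of both checks, assuming the string variable is called var6 as in the example above. The name var6_code is hypothetical, chosen only for illustration, and none of this has been run against your data.

```stata
* Sketch only: var6 is the name used in the example above;
* var6_code is a hypothetical name for the encoded variable.

* 1) Strip leading and trailing blanks from the text responses
replace var6 = trim(var6)

* 2) Count responses that are empty once trimmed
count if var6 == ""

* List the distinct text values to spot stray or near-duplicate categories
tabulate var6, missing

* Re-encode the cleaned variable, then compare labels with assigned codes
encode var6, generate(var6_code)
tabulate var6_code
tabulate var6_code, nolabel
```

After the trim, responses that held only blanks become "", and -encode- leaves those as missing rather than giving them a category number.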
good luck,
Dan