HealthMaps (Richard Hoskins)
>
> I have string variables with codes (1 million records; the database works
> very well in Stata, at 100+ MB):
>
> CAUSE (a string variable) AGE ... more variables
> A391 71
> A483 32
> A985 45
> B222 85
> B230 91
> etc...
>
>
> A sample of the codes follows (taken from a Proc Format statement in SAS).
> There are thousands of them, and any one database will contain a subset of
> these codes. They are shortened about as much as they can be; there are
> subtle distinctions in some codes for closely related diseases.
>
> 'A391'='Waterhouse Friderichsen syndrome'
> 'A483'='Toxic shock syndrome'
> 'A985'='Hemorrhagic fever w renal syndrome'
> 'B222'='HIV dis wasting syndrome'
> 'D814'='Nezelofs syndrome'
> 'D820'='Wiskott Aldrich syndrome'
> 'D821'='Di Georges syndrome'
> 'D824'='Hyperimmunoglobulin E [IgE] syndrome'
>
> Then when I want to do a frequency table, or almost anything else where the
> code will be displayed, I want to see the value label for the code instead.
>
> I do not want to see
>
> CAUSE frequencies or whatever
> A391
> A483
> D814
> D820
> D821
> D824
>
> on my output but:
> CAUSE frequencies or whatever
> Waterhouse Friderichsen syndrome
> Toxic shock syndrome
> Hemorrhagic fever w renal syndrome
> Nezelofs syndrome
> Wiskott Aldrich syndrome
> Di Georges syndrome
> Hyperimmunoglobulin E [IgE] syndrome
>
>
> Clearly this is an extreme case, but it comes up with Zip Codes or other
> geographic identifiers that appear as strings, often with periods in them
> (like US census tracts), from a GIS. I can't use these as numbers, and I need
> to link value labels to them (like the name of the Zip Code).
>
Understood. Value labels attach only to numeric variables, so you can
instead store these as two string variables: the code and its description.
Somewhere you can presumably find a file with all the definitions, and you
can then -merge- this with your data set.
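
A minimal sketch of that -merge-, under some assumptions: the variable name
causename, the temporary lookup file, and the tiny typed-in datasets are
hypothetical stand-ins for your real files, and the merge m:1 syntax assumes
a reasonably recent Stata (older releases need sorted data and a different
-merge- syntax). The codes and descriptions are the ones quoted above; B230
has no definition in the sample, so it shows what happens to unmatched codes.

```stata
* Sketch only: causename and the lookup tempfile are hypothetical names.
clear
input str4 cause str40 causename
"A391" "Waterhouse Friderichsen syndrome"
"A483" "Toxic shock syndrome"
"A985" "Hemorrhagic fever w renal syndrome"
"B222" "HIV dis wasting syndrome"
end
tempfile lookup
save `lookup'

* Stand-in for the main data set (one record per death, say).
clear
input str4 cause age
"A391" 71
"A483" 32
"A985" 45
"B222" 85
"B230" 91
end

* m:1 = many records per code in the data, one definition per code.
merge m:1 cause using `lookup', keep(master match)

* Codes with no definition keep the raw code as their description.
replace causename = cause if _merge == 1
drop _merge

tabulate causename
```

The table then shows descriptions rather than codes. Keeping both variables
means you can still select or sort on the compact code when that is easier.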
Nick