Dear All,
Working with large datasets, I have run into a problem with observation
ids: my original ids are far too long (say, strings of length 20 of the form
"PROVINCE-CITY-HOUSEHOLD..."). The id variable alone sometimes takes up
half of my file. Generating numerical ids (as explained in a very useful
FAQ by N. Cox) helps, but then I sometimes have problems with the
rounding of numbers (since I have ids from 1 to, say, 16 million).
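To make the rounding concern concrete, here is a small sketch in Python
(not Stata). It assumes that a Stata float behaves like a 4-byte IEEE
single-precision number; the helper name as_float32 is made up for this
illustration only.

```python
import struct

def as_float32(n):
    """Store an integer in a 4-byte float and read it back as an integer."""
    packed = struct.pack("f", float(n))      # round to single precision
    return int(struct.unpack("f", packed)[0])

# Assumption: integers are stored exactly in a 4-byte float only up to 2**24.
FLOAT_EXACT_LIMIT = 2 ** 24                  # 16,777,216

for n in (16_000_000, FLOAT_EXACT_LIMIT, FLOAT_EXACT_LIMIT + 1):
    stored = as_float32(n)
    print(n, stored, "exact" if stored == n else "rounded")
```

Under that assumption, ids up to 16,777,216 survive unchanged, while
16,777,217 already comes back as a different number.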
I thought of a solution that still uses strings but is more compact than
my original ids: storing the numerical ids as strings in hexadecimal
notation. I found a discussion by W. Gould on this list, but it
treated hex mainly as a way of displaying numbers (from a FAQ:
"Stata also provides a special %21x format that shows the exact value in
a special hexadecimal format").
I was wondering how I can go from a float (numerical id) to a compact
string showing the hexadecimal value (perhaps even more compact than the
%21x format, since I only have positive integers). The conversion might
also lose precision, which of course I need to avoid.
I guess my question boils down to converting a numeric variable into a
string variable that holds its displayed (formatted) value.
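To illustrate the kind of compact id I have in mind, here is a minimal
sketch in Python (not Stata, and not a claim about how Stata would do
it). The function names id_to_hex and hex_to_id and the fixed width of 6
hex digits are my own assumptions; 6 digits cover positive integers up to
16,777,215.

```python
def id_to_hex(n, width=6):
    """Return a positive integer id as a zero-padded lowercase hex string."""
    if n != int(n) or n < 0:
        raise ValueError("id must be a non-negative integer")
    n = int(n)
    if n >= 16 ** width:
        raise ValueError("id does not fit in the requested width")
    return format(n, "0{}x".format(width))

def hex_to_id(s):
    """Recover the integer id from its hex string."""
    return int(s, 16)

# Round trip: the string form must give back exactly the same id.
for n in (1, 255, 16_000_000):
    h = id_to_hex(n)
    assert hex_to_id(h) == n
    print(n, h)
```

The zero padding keeps all ids the same length, so sorting the strings
gives the same order as sorting the numbers.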
Thank you very much for any suggestions.