And in the meantime, the world marches on. In my jurisdiction we have
been using ICD-10* for over five years.
In the past I have parsed the ICD-9 file using regular expressions,
relying on the fact that chapters, categories and subcategories are
all laid out in a consistent format. I suspect one could in fact do
this in Stata, using the new regexm() function and the low-level file
I/O commands. As for the short descriptions, one can usually work out
what they mean from the adjacent code descriptions and their pattern
of abbreviations.
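
To make the regular-expression idea concrete, here is a minimal sketch
in Python rather than Stata. The line layout it expects is an
assumption made for illustration, not the actual format of any
particular ICD-9 file: a chapter line such as "1. TITLE (001-139)", a
category line such as "001 Title", and a subcategory line such as
"001.0 Title". The function name and patterns are hypothetical and
would need adjusting to the real file (V and E codes, for instance,
are not handled).

```python
import re

# Assumed (hypothetical) line layouts; adjust to match the real file.
PATTERNS = [
    # e.g. "1. INFECTIOUS AND PARASITIC DISEASES (001-139)"
    ("chapter", re.compile(r"^(?P<code>\d+)\.\s+(?P<title>.+?)\s+\(\w+-\w+\)$")),
    # e.g. "001.0 Due to Vibrio cholerae"
    ("subcategory", re.compile(r"^(?P<code>\d{3}\.\d{1,2})\s+(?P<title>.+)$")),
    # e.g. "001 Cholera"
    ("category", re.compile(r"^(?P<code>\d{3})\s+(?P<title>.+)$")),
]


def parse_icd9_lines(lines):
    """Return (level, code, title) tuples for lines matching a known layout.

    Blank lines and lines that match none of the patterns are skipped.
    """
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        for level, pattern in PATTERNS:
            match = pattern.match(line)
            if match:
                rows.append((level, match.group("code"), match.group("title")))
                break
    return rows
```

Calling parse_icd9_lines() on the lines of the text file gives one
tuple per recognised line, which can then be written out as a
delimited file and read into Stata.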
* ICD-10 states that "… in the interests of international
comparability, no changes should be made in the content (as indicated
by the titles) of the three-character categories and the
four-character sub-categories of the Tenth Revision … except as
authorized by WHO…. WHO should be promptly notified about the
intention to produce translations and adaptations or other
ICD-related classifications." In an effort to enforce this position,
ICD-10 was the first of the ICD revisions to be copyrighted, which may
explain the slow uptake in the US, where ICD-9 was heavily modified
for various purposes involving coding for reimbursement.