Since your example has diag9201 and diag98201 belonging to diagnosis groups 9 and 98,
you might want to use the ? wildcard, which stands for exactly one character.
For example:
egen diag9 = rsum(diag9???)
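The pattern diag9??? matches diag9201 but not diag98201, because each ? stands for
exactly one character and diag98201 has four characters after diag9, not three.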
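If you want to check what the pattern picks up before summing, something like the sketch below might help. The data values and the variable diag9305 are made up for illustration. It assumes group 9 names always have exactly three characters after diag9 and group 98 names always have four, since ? matches on length only. rsum() is the older name for what later Stata versions document as rowtotal().

```stata
* Toy data: names follow the thread, values are invented
clear
input diag9201 diag9305 diag98201
1 0 1
0 1 1
0 0 1
end

* List the variables the pattern expands to:
* "diag9" followed by exactly three more characters
describe diag9???, simple

* Row sum over group 9 only; diag98201 is left out
egen diag9 = rsum(diag9???)
list
```

The describe line is only there as a check, so you can see the expanded variable list before you create the new variable.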
Richard
*--------------------------------------------------------
Richard Atkins
London School of Hygiene and Tropical Medicine