
Re: st: Categorizing HIV status using a series of string variables

From	"Tom Trikalinos" <[email protected]>
To	[email protected]
Subject	Re: st: Categorizing HIV status using a series of string variables
Date	Tue, 25 Nov 2008 07:13:09 -0500

Howard posted a complete solution. I apologize for not reading
Chelsea's e-mail carefully. Glad it worked out in the end.

On Mon, Nov 24, 2008 at 9:17 PM, Howard Lempel <[email protected]> wrote:
> Chelsea,
>
> I haven't used regular expressions in a bit, so someone should correct me if I'm wrong, but I think the problem you mention would be solved by replacing Tom's first line of code with:
>
> gen group = (regexm(HIV, "N[\.I]*P"))
>
> The bracket expression [\.I]* matches zero or more "."s and "I"s, so the pattern finds an N followed by a P even when only "."s and "I"s lie between them.  For more info on using regular expressions, see the FAQ at http://www.stata.com/support/faqs/data/regex.html and -help regexm-.
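
As a quick check of the pattern above, here is a minimal sketch on toy data. The first three strings are the ones from Chelsea's message; the last two are made-up controls, and the variable name n_then_p is only illustrative. It is not the code from Tom's earlier post.

```stata
* Toy data: three strings from the thread plus two hypothetical controls
clear
input str12 HIV
"......NI..PP"
".....NIP...."
"......NI...P"
"....NNNN...."
"..PPPP......"
end

* 1 if some N is followed by a P with only "." or "I" in between
gen byte n_then_p = regexm(HIV, "N[\.I]*P")

list HIV n_then_p
```

The three strings from the thread should come out flagged, while the all-negative and all-positive controls should not, since neither contains an N that is later followed by a P.
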
>
> Also, I don't know if you have any cases where someone is indeterminate or missing in every period.  If you have any such cases, I think Tom's code will code those as group 3, which does not seem appropriate. You may want to add a line of code as follows:
>
> replace group = 4 if !regexm(HIV, "N") & !regexm(HIV, "P")
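
The condition in that -replace- can be tried on its own before it touches the real variable. The sketch below uses hypothetical toy strings and an illustrative flag name (no_result); it assumes the actual -replace- is run after the lines that assign groups 1 to 3.

```stata
* Toy data (hypothetical): two histories with no N or P, and two controls
clear
input str12 HIV
"............"
"..I....I...."
"....NNNN...."
"......NI..PP"
end

* 1 when the history contains neither an N nor a P
gen byte no_result = !regexm(HIV, "N") & !regexm(HIV, "P")

list HIV no_result
```

Only the first two rows should be flagged, since they are the only ones with no determinate result in any period.
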
>
> You will also want value labels for your variable.  The following code should -label- your group variable.
>
> lab define grouplab 1 "incident seroconverter" 2 "prevalent positive" ///
>        3 "consistently seronegative" 4 "missing/indeterminate"
>
> lab val group grouplab
>
> Lastly, I'd like to warn you that I've had some trouble with the way that Stata's -regexm- function deals with missing values.  If you have any truly missing values (i.e. ""), I would carefully check to make sure that the -regexm- function is dealing with them in the right way.  See this thread for the problem I had:
> http://www.stata.com/statalist/archive/2008-10/msg00935.html
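
One way to carry out that check is to pull out the empty strings and look at what the match returns for them. This is only a sketch on made-up data; has_np is an illustrative name, and no particular result for the empty string is assumed here.

```stata
* Toy data (hypothetical): one empty string among two ordinary histories
clear
input str12 HIV
"......NI..PP"
""
"....NNNN...."
end

gen byte has_np = regexm(HIV, "N[\.I]*P")

* missing() is true for an empty string, so these isolate the "" cases
count if missing(HIV)
list HIV has_np if missing(HIV)
```

On real data, the same -count- and -list- lines can be run against the group variable to see where observations with an empty HIV string ended up.
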
>
> Hope this helps
> Howie
>
> Howie Lempel
> Research Assistant
> The Brookings Institution | Economic Studies
>
> 1775 Massachusetts Ave NW | Washington DC 20036
> [email protected] | p: (202) 238-3576
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Polis, Chelsea B.
> Sent: Monday, November 24, 2008 8:49 PM
> To: [email protected]
> Subject: RE: st: Categorizing HIV status using a series of string variables
>
> Many thanks, Tom and Howie!
>
> Tom: Your solution worked beautifully, except for one tiny thing.  I got a few people who weren't assigned to one of the three groups, and their codes all had one thing in common...an "indeterminate" test between their negative and positive tests:
>
> hiv
> ......NI..PP
> .....NIP....
> ......NI...P
>
> I can very easily just recode these people to be incident seroconverters, but I wonder if there is an easy fix for the code that would do this automatically?
>
> Howie: Many thanks for the explanation...that clears up a lot of my confusion!
>
> BTW: my apologies if this post ends up in the wrong spot - I'm still trying to understand how to reply to individual postings when I receive Statalist in digest form...I'm hoping that slapping a "RE:" in front of the subject line I wish to respond to will allow me to do that, but I couldn't find information in the FAQ on specifically how to do this.
>
> Cheers,
> Chelsea
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

References:
- RE: st: Categorizing HIV status using a series of string variables
  - From: "Polis, Chelsea B." <[email protected]>
- RE: st: Categorizing HIV status using a series of string variables
  - From: Howard Lempel <[email protected]>
