Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Combining four variables into one
From: Amal Khanolkar <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: Combining four variables into one
Date: Wed, 15 Aug 2012 11:07:55 +0000
Thanks Nick,
Those were very simple and straightforward ways of combining variables. In the first option, does 'max' indicate the maximum possible value, i.e. 1 in this case?
Both of the ways you suggested give me the same total of 80,346, but I was expecting a total of 81,360. Could it be that subjects with multiple diagnoses are counted just once, i.e. the first time they are coded as 1?
. tab gestht

     gestht |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

. egen gesthtx = rowmax(gestht1 gestht2 gestht3 gestht4)

. tab gesthtx

    gesthtx |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

Thanks,
/Amal.
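
A note on the two totals: the frequencies of 1 in the four separate tables sum to 4,926 + 21,420 + 54,494 + 520 = 81,360, so 81,360 looks like a count of diagnoses, while 80,346 is a count of subjects with at least one diagnosis. The sketch below shows one way to check this kind of gap with -egen, rowtotal()-. It runs on five made-up subjects, not the data in this thread, and the names ndiag and anyht are hypothetical.

```stata
* Toy data: five hypothetical subjects, not the poster's data
clear
input gestht1 gestht2 gestht3 gestht4
0 0 0 0
1 0 0 0
0 1 1 0
1 0 1 1
0 0 0 0
end

* Number of diagnoses per subject (ndiag is a made-up name)
egen ndiag = rowtotal(gestht1 gestht2 gestht3 gestht4)

* Any diagnosis: 1 if at least one of the four variables is 1
generate byte anyht = ndiag > 0

* Total number of diagnoses across all subjects
quietly summarize ndiag
display "diagnoses: " r(sum)

* Number of subjects with at least one diagnosis
count if anyht == 1

* Subjects with two or more diagnoses account for the difference
tabulate ndiag
```

If the same pattern holds in the real data, the rows of -tabulate ndiag- with values of 2 or more would explain the difference between the two totals.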
________________________________________
From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
Sent: 15 August 2012 12:48
To: [email protected]
Subject: Re: st: Combining four variables into one
At a guess, you should not -replace- any of these variables as they
all might be useful and needed for something else. Consider
gen gestht = max(gestht1, gestht2, gestht3, gestht4)
or
egen gestht = rowmax(gestht1 gestht2 gestht3 gestht4)
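
As a minimal sketch of what the two commands do, the example below applies both to four made-up subjects (not the poster's data). The -max()- function returns the largest of its arguments within each observation, so with 0/1 indicators the result is 1 whenever any of the four is 1. The names anyht and anyht2 are hypothetical.

```stata
* Toy data: four hypothetical subjects, not the poster's data
clear
input gestht1 gestht2 gestht3 gestht4
0 0 0 0
1 0 0 0
0 1 1 0
1 1 1 1
end

* max() takes the largest argument within each observation
generate anyht = max(gestht1, gestht2, gestht3, gestht4)

* rowmax() does the same over a varlist
egen anyht2 = rowmax(gestht1 gestht2 gestht3 gestht4)

* The two indicators should agree in every observation
assert anyht == anyht2

* Each subject contributes one row, however many diagnoses she has
list
```

Either way the original four variables are left untouched.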
On a small point of English: one diagnosis, two diagnoses.
This kind of question bolsters my prejudice that the functions
(including -egen- functions) are one of the most neglected parts of
Stata. See also
Cox, N.J. 2011. Speaking Stata: Fun and fluency with functions.
The Stata Journal 11(3): 460-471
Abstract. Functions are the unsung heroes of Stata. This column is a
tour of functions that might easily be missed or underestimated, with
a potpourri of tips, tricks, and examples for a wide range of basic
problems.
Nick
On Wed, Aug 15, 2012 at 11:31 AM, Amal Khanolkar <[email protected]> wrote:
> I have the following four variables, where 1 indicates a diagnosis of a particular type of hypertension. As I don't have a sufficient number of cases once I take my covariates into account, I now need to combine these four variables into one variable: 0 = no diagnosis of hypertension, and 1 = a diagnosis of any type of hypertension (in any of the four variables). Some subjects might have multiple diagnoses. Is there a better and easier way to do this than using the replace command?
>
> I would also like to create a variation of the combined variable such that each subject is entered only once if she has multiple diagnoses, to compare it with the other combined variable, where all multiple diagnoses for a subject are included.
>
> . tab gestht1
>
>  Chronic/ess |
>       ential |
>  hypertensio |
>  n O10-O11 & |
>    642A-C, H |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>            0 |  2,986,530       99.84       99.84
>            1 |      4,926        0.16      100.00
> ------------+-----------------------------------
>        Total |  2,991,456      100.00
>
> . tab gestht2
>
>  Gestational |
>  hypertensio |
>      n O13 & |
>   642D, 642X |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>            0 |  2,970,036       99.28       99.28
>            1 |     21,420        0.72      100.00
> ------------+-----------------------------------
>        Total |  2,991,456      100.00
>
> . tab gestht3
>
>  Preeclampsi |
>         a or |
>    eclampsia |
>   O14, O15 & |
>       642E-G |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>            0 |  2,936,962       98.18       98.18
>            1 |     54,494        1.82      100.00
> ------------+-----------------------------------
>        Total |  2,991,456      100.00
>
> . tab gestht4
>
>  Preexisting |
>  hypertensio |
>       n with |
>  preeclampsi |
>      a O11 & |
>         642H |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>            0 |  2,990,936       99.98       99.98
>            1 |        520        0.02      100.00
> ------------+-----------------------------------
>        Total |  2,991,456      100.00
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/