st: Re: basic qiestion
From
Eric Booth <[email protected]>
To
"[email protected]" <[email protected]>
Subject
st: Re: basic qiestion
Date
Tue, 23 Aug 2011 02:02:53 +0000
I got this reply from Nadine off-list--
On Aug 22, 2011, at 8:40 PM, Nadine Brooks wrote:
> Thanks Phil and Eric, but even with egen I cannot solve my problem.
>
> I am working with survey data on 410,241 individuals of all ages.
> Some of them work and others do not. The variables that I want to sum
> are:
>
> v9532: income from main job
> v9982: income from secondary job
> v1022: income from the third or more jobs
>
> Only 170,014 individuals work, so when I use egen
> sal=rowtotal(v9535 v9982 v1022) I get people with income equal to
> zero...
>
> Take a look:
>
> sum v9532 v9982 v1022
>
> Variable | Obs Mean Std. Dev. Min Max
> -------------+--------------------------------------------------------
> v9532 | 170014 831.5625 1451.442 3 120000
> v9982 | 8326 686.3957 1179.807 1 48000
> v1022 | 672 957.75 1422.576 8 11000
>
> egen sal=rowtotal(v9535 v9982 v1022)
>
> . sum v9532 v9982 v1022 sal
>
> Variable | Obs Mean Std. Dev. Min Max
> -------------+--------------------------------------------------------
> v9532 | 170014 831.5625 1451.442 3 120000
> v9982 | 8326 686.3957 1179.807 1 48000
> v1022 | 672 957.75 1422.576 8 11000
> sal | 410241 15.88779 225.3688 0 48000
>
> Now all the individuals in my survey data have some income,
> even zero. But I don't want that.
>
The zeros are from observations where all three v* variables are missing.
The help file entry for -egen, rowtotal()- says:
... It creates the (row) sum of the variables in varlist, treating missing as 0. If missing is
specified and all values in varlist are missing for an observation, newvar is set to missing.
So, you can change your code
egen sal=rowtotal(v9535 v9982 v1022)
to
egen sal=rowtotal(v9535 v9982 v1022), missing
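A minimal sketch of the difference the option makes, on three made-up rows (the data and the names sal_default and sal_missing are only for illustration and are not from Nadine's dataset):

```stata
clear
input v9535 v9982 v1022
3 4 5
5 . 6
. . .
end

* Default behaviour: missing values are treated as 0,
* so the all-missing row gets a total of 0
egen sal_default = rowtotal(v9535 v9982 v1022)

* With the missing option, a row where every variable is missing
* gets a missing total instead of 0
egen sal_missing = rowtotal(v9535 v9982 v1022), missing

list

* summarize skips missing values, so Obs for sal_missing
* counts only the rows with at least one nonmissing income
summarize sal_default sal_missing
```

Rows that are only partly missing are still summed with the missing parts counted as 0; only the all-missing row changes.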
...
> Now all the individuals in my survey data have some income,
> even zero. But I don't want that.
>
> Following your advice I also tried: egen sal=rowtotal(v9535 v9982
> v1022) if v9535>0
> because anyone who has a 2nd and/or 3rd job must have the first (main) one.
> But it did not work either.
Note that the reason
egen sal=rowtotal(v9535 v9982 v1022) if v9535>0
won't work as you intend is that Stata treats missing values (.) as larger than any number (see -help missing-), so "if v9535>0" evaluates to true when v9535 is missing, even though you expect it to evaluate to false.
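To see the comparison at work, here is a small sketch on made-up data (the rows and the name sal_if are hypothetical). It counts how many observations pass each condition and then adds an explicit !missing() check to the -if- qualifier, which is one common way to exclude missing values:

```stata
clear
input v9535 v9982 v1022
3 4 5
5 . 6
. . .
end

* Missing (.) is treated as larger than any number,
* so this also counts the row where v9535 is missing
count if v9535 > 0

* Excluding missing values explicitly counts only observed, positive values
count if v9535 > 0 & !missing(v9535)

* Observations that fail the -if- condition are left missing in the new variable
egen sal_if = rowtotal(v9535 v9982 v1022) if v9535 > 0 & !missing(v9535)
list
```

Writing the condition as "if v9535 > 0 & v9535 < ." is an equivalent alternative.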
Returning to my example, you would run:
****
clear
input v9535 v9102 v1022
3 4 5
5 . 6
9 . .
. . .
1 1 1
end
egen sal3 = rowtotal(v9535 v9102 v1022), missing
list
*****
- Eric