Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Grouping income variables- RECODE COMMAND
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Grouping income variables- RECODE COMMAND
Date: Sun, 2 Feb 2014 08:58:56 +0000
Your -recode- mapped 1,...,11 to 1,...,11, which makes precisely no
progress with the main problem. As I understand what you want, you
need something more like
recode hinctnt 1=40 2=70 3=130 ...
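A fuller sketch of that idea follows. It assumes each code is mapped to the upper limit of its band, with the limits for codes 6 to 10 inferred from the cutpoints in the original -recode()- call rather than stated in the thread. The value used for the open-ended top band (code 12), the new names hinc_val, hinc_gr and inc_, and the choice of upper limits rather than midpoints are all illustrative assumptions, not something the thread confirms.

```stata
* Sketch only: replace the codes 1-12 with an assumed income value per band.
* Limits for codes 6-10 are inferred from the cutpoints in the original
* recode() call. 2310 for the open-ended top band (code 12) is a placeholder,
* so codes 11 and 12 are indistinguishable in hinc_val until you change it.
recode hinctnt (1 = 40) (2 = 70) (3 = 120) (4 = 230) (5 = 350) (6 = 460) ///
    (7 = 580) (8 = 690) (9 = 1150) (10 = 1730) (11 = 2310) (12 = 2310)   ///
    (77 = .r) (88 = .d) (99 = .s), generate(hinc_val)

* Merge J and R (codes 1 and 2) into a single "<70" group. Codes 3-12 are
* left unchanged; the non-response codes 77, 88 and 99 become missing.
recode hinctnt (1 2 = 1) (77 88 99 = .), generate(hinc_gr)

* One indicator (dummy) variable per remaining income group:
* inc_1, inc_2, ... Observations with missing hinc_gr get missing dummies.
tabulate hinc_gr, generate(inc_)
```

With these assumptions, -summarize hinc_val- would report values on the income scale rather than on the 1 to 12 code scale, and the inc_ variables can be used directly as regressors.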
Nick
[email protected]
On 1 February 2014 19:43, Antonio Rodriguez Andres
<[email protected]> wrote:
> Nick
>
> You are right. But if I type the following code
>
> recode hinctnt (1=1 "1st interval") (2=2 "2nd interval") (3=3 "3rd interval") (4=4 "4th interval") (5=5 "5th interval") (6=6 "6th interval") (7=7 "7th interval") (8=8 "8th interval") (9=9 "9th interval") (10=10 "10th interval") (11=11 "11th interval") (12=12 "12th interval") (.=.m "Missing") (77=.r "Refusal") (88=.d "Don't Know") (99=.s "Not answer"), gen (ihinctnt)
>
> I generate a new variable ihinctnt. Then I tabulated it and computed summary statistics. But these are not incomes. I should specify the upper and lower limit for each interval. How can I do it?
>
>
> tab ihinctnt, missing
>
>     RECODE of |
>       hinctnt |
>  (Household's |
>     total net |
>   income, all |
>      sources) |      Freq.     Percent        Cum.
> --------------+-----------------------------------
>  1st interval |      1,663        3.87        3.87
>  2nd interval |      1,561        3.63        7.50
>  3rd interval |      2,262        5.26       12.76
>  4th interval |      3,676        8.55       21.31
>  5th interval |      3,545        8.24       29.55
>  6th interval |      3,293        7.66       37.21
>  7th interval |      3,010        7.00       44.21
>  8th interval |      2,871        6.68       50.89
>  9th interval |      4,707       10.95       61.83
> 10th interval |      2,058        4.79       66.62
> 11th interval |        644        1.50       68.12
> 12th interval |        428        1.00       69.11
>    Don't Know |      3,540        8.23       77.34
>       Missing |      5,037       11.71       89.06
>       Refusal |      4,525       10.52       99.58
>    Not answer |        180        0.42      100.00
> --------------+-----------------------------------
>         Total |     43,000      100.00
>
> . summ ihinctnt
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>     ihinctnt |     29718    6.156504     2.75604          1         12
>
> . summ ihinctnt,d
>
>   RECODE of hinctnt (Household's total net income, all sources)
> -------------------------------------------------------------
>       Percentiles      Smallest
>  1%            1              1
>  5%            1              1
> 10%            2              1       Obs               29718
> 25%            4              1       Sum of Wgt.       29718
>
> 50%            6                      Mean           6.156504
>                         Largest       Std. Dev.       2.75604
> 75%            9             12
> 90%           10             12       Variance       7.595757
> 95%           10             12       Skewness       -.080652
> 99%           12             12       Kurtosis       2.098037
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: Saturday, February 01, 2014 9:17 PM
> To: [email protected]
> Subject: Re: st: Grouping income variables- RECODE COMMAND
>
> The numeric values of -hinctnt- don't exceed 99. They are evidently numeric codes, not incomes. So why are you surprised at your results?
> You have to -recode- your data before you can classify them. And that means the -recode- command.
> Nick
> [email protected]
>
>
> On 1 February 2014 18:14, Antonio Rodriguez Andres <[email protected]> wrote:
>> Here you can see the basic description of the income variable
>>
>> tab hinctnt
>>
>> Household's |
>> total net |
>> income, all |
>> sources | Freq. Percent Cum.
>> ------------+-----------------------------------
>> J | 1,663 4.38 4.38
>> R | 1,561 4.11 8.49
>> C | 2,262 5.96 14.45
>> M | 3,676 9.68 24.13
>> F | 3,545 9.34 33.47
>> S | 3,293 8.67 42.15
>> K | 3,010 7.93 50.08
>> P | 2,871 7.56 57.64
>> D | 4,707 12.40 70.04
>> H | 2,058 5.42 75.46
>> U | 644 1.70 77.15
>> N | 428 1.13 78.28
>> Refusal | 4,525 11.92 90.20
>> Don't know | 3,540 9.32 99.53
>> No answer | 180 0.47 100.00
>> ------------+-----------------------------------
>> Total | 37,963 100.00
>>
>>
>> sum hinctnt, d
>>
>> Household's total net income, all sources
>> -------------------------------------------------------------
>> Percentiles Smallest
>> 1% 1 1
>> 5% 2 1
>> 10% 3 1 Obs 37963
>> 25% 5 1 Sum of Wgt. 37963
>>
>> 50% 7 Mean 22.67271
>> Largest Std. Dev. 31.57352
>> 75% 10 99
>> 90% 77 99 Variance 996.8872
>> 95% 88 99 Skewness 1.378759
>> 99% 88 99 Kurtosis 2.984444
>>
>> .
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Nick Cox
>> Sent: Saturday, February 01, 2014 7:52 PM
>> To: [email protected]
>> Subject: Re: st: Grouping income variables- RECODE COMMAND
>>
>> Your code shows you using the -recode()- function, which is quite different from the -recode- command. In Stata, functions and commands are different!
>>
>> I think that to comment helpfully we need to see more about your
>> -hinctnt-, for example, the results of
>>
>> . su hinctnt, detail
>>
>> Your categories are not disjoint as (e.g.) the definitions [70, 120] and [120, 230] leave ambiguous what happens with 120. Alternatively, your notation here confuses the meaning of [ ] and ( ).
>> Nick
>> [email protected]
>>
>>
>> On 1 February 2014 17:29, Antonio Rodriguez Andres <[email protected]> wrote:
>>> Dear Stata users,
>>>
>>> I have to group the income variable in different intervals. In the
>>> original dataset, the household income variable is grouped into 12
>>> categories
>>>
>>> J <40
>>> R [40,70]
>>> C [70, 120]
>>> M [120, 230]
>>> F [230, 350]
>>> S
>>> K
>>> P
>>> D
>>> H
>>> U [1730, 2310)
>>> N > 2310
>>>
>>> I want to group the J and R categories into <70 Euros, and create dummy
>>> variables for all income groups. Below is the Stata output. I used the
>>> recode command, but it does not work.
>>>
>>> gen hinc_gr=recode(hinctnt, 70, 120, 230, 350, 460, 580, 690, 1150, 1730, 2310)
>>> (13282 missing values generated)
>>>
>>> . tab hinc_gr
>>>
>>> hinc_gr | Freq. Percent Cum.
>>> ------------+-----------------------------------
>>> 70 | 29,718 100.00 100.00
>>> ------------+-----------------------------------
>>> Total | 29,718 100.00
>>>