Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: AW: Bug? proportion not using value labels
From: "Martin Weiss" <[email protected]>
To: <[email protected]>
Subject: st: AW: Bug? proportion not using value labels
Date: Thu, 25 Feb 2010 10:12:00 +0100
Certainly not a bug! As Tirthankar said, the labels must be valid Stata
-varname-s, which strings with embedded blanks or slashes are not.
So you have to edit your labels so that they become valid -varname-s.
Maybe NJC's -labutil- package (-findit labutil-) can help with that. The
example below shows which labels make it into the table and which do not:
***
sysuse nlsw88, clear
prop industry
***
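If no ready-made tool fits, the labels can also be redefined by hand. What follows is only a minimal sketch, assuming the behaviour described in this thread (labels that are valid names get used, others fall back to _subpop_#). The label name collgrad_ok and the strings "not_grad" and "grad" are made up for illustration, and newer Stata releases may display -over()- groups differently:

```stata
* Hypothetical sketch: attach value labels that are valid Stata names
sysuse nlsw88, clear

* collgrad is coded 0/1; collgrad_ok and its strings are invented here
* and contain only letters and underscores (no blanks, slashes or "+")
label define collgrad_ok 0 "not_grad" 1 "grad"
label values collgrad collgrad_ok

* The subpopulations should now be identified by the new labels
proportion union, over(collgrad)
```

The same idea applied to the age groups below would mean labels along the lines of "age75_79" rather than "75-79", with the original label kept for -tab- and graphs and the name-safe one attached just before -proportion-.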
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Zoe Hyde
Sent: Thursday, 25 February 2010 08:08
To: [email protected]
Subject: st: Bug? proportion not using value labels
Hello all,
I am trying to get the proportion command to use the value labels associated
with a variable to identify subpopulations, but I may have run into a
bug. It seems a value label is used only if it consists entirely of letters
(A-Za-z) or entirely of digits (0-9), with no mixing of the two sets.
For example, this doesn't work:
label define agegroup_cat 1 "75-79" 2 "80-84" 3 "85-89" 4 "90+"
label values agegroup agegroup_cat
tab agegroup
   Age (w3) |      Freq.     Percent        Cum.
------------+-----------------------------------
      75-79 |      1,316       40.16       40.16
      80-84 |      1,335       40.74       80.90
      85-89 |        516       15.75       96.64
        90+ |        110        3.36      100.00
------------+-----------------------------------
      Total |      3,277      100.00
proportion had_sex, over(agegroup)
Proportion estimation                 Number of obs   =   2783

      _prop_1: had_sex = 0
      _prop_2: had_sex = 1

    _subpop_1: agegroup = 75-79
    _subpop_2: agegroup = 80-84
    _subpop_3: agegroup = 85-89
    _subpop_4: agegroup = 90+

--------------------------------------------------------------
        Over | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
_prop_1      |
   _subpop_1 |   .6043668    .0144572     .5760189    .6327147
   _subpop_2 |   .7196429    .0134276     .6933137     .745972
   _subpop_3 |   .8142202    .0186477     .7776554    .8507849
   _subpop_4 |   .8902439    .0347317     .8221413    .9583465
-------------+------------------------------------------------
_prop_2      |
   _subpop_1 |   .3956332    .0144572     .3672853    .4239811
   _subpop_2 |   .2803571    .0134276      .254028    .3066863
   _subpop_3 |   .1857798    .0186477     .1492151    .2223446
   _subpop_4 |   .1097561    .0347317     .0416535    .1778587
--------------------------------------------------------------
But if groups 1 and 4 only contain characters from a single set, then it
does work:
label define agegroup_cat2 1 "75" 2 "80to84" 3 "85 to 89" 4 "Ninetyplus"
label values agegroup agegroup_cat2
tab agegroup
   Age (w3) |      Freq.     Percent        Cum.
------------+-----------------------------------
         75 |      1,316       40.16       40.16
     80to84 |      1,335       40.74       80.90
   85 to 89 |        516       15.75       96.64
 Ninetyplus |        110        3.36      100.00
------------+-----------------------------------
      Total |      3,277      100.00
proportion had_sex, over(agegroup)
Proportion estimation                 Number of obs   =   2783

      _prop_1: had_sex = 0
      _prop_2: had_sex = 1

           75: agegroup = 75
    _subpop_2: agegroup = 80to84
    _subpop_3: agegroup = 85 to 89
   Ninetyplus: agegroup = Ninetyplus

--------------------------------------------------------------
        Over | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
_prop_1      |
          75 |   .6043668    .0144572     .5760189    .6327147
   _subpop_2 |   .7196429    .0134276     .6933137     .745972
   _subpop_3 |   .8142202    .0186477     .7776554    .8507849
  Ninetyplus |   .8902439    .0347317     .8221413    .9583465
-------------+------------------------------------------------
_prop_2      |
          75 |   .3956332    .0144572     .3672853    .4239811
   _subpop_2 |   .2803571    .0134276      .254028    .3066863
   _subpop_3 |   .1857798    .0186477     .1492151    .2223446
  Ninetyplus |   .1097561    .0347317     .0416535    .1778587
--------------------------------------------------------------
Although I could use the key to work out which groups are which, I am
sending this output off to another dataset (with parmest) to produce some
graphs. It's a real pain if I have to manually edit the dataset/re-label
variables for every graph I want to produce.
Does anyone have any ideas on how I can get proportion to use the value
labels I have defined, no matter what characters they contain?
Zoe.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/