Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: AW: Bug? proportion not using value labels

From	"Martin Weiss" <[email protected]>
To	<[email protected]>
Subject	st: AW: Bug? proportion not using value labels
Date	Thu, 25 Feb 2010 10:12:00 +0100



Certainly not a bug! As Tirthankar said, the labels must be valid Stata
-varname-s, which strings containing embedded blanks or slashes are not.
So you have to find a way to edit your labels so that they become valid
-varname-s. Maybe NJC's -labutil- package (see -findit labutil-) can help
with that. The example below shows which labels make it to the table and
which ones do not:

***
sysuse nlsw88, clear
prop industry
***
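
If editing the labels by hand is too tedious, one possible approach is to build a cleaned copy of the value label in a loop. This is only a sketch under assumptions: it relies on strtoname(), which I believe needs Stata 11 or later, and the label name indclean is made up for this illustration. Distinct labels could also collapse to the same name after conversion, so check the result before relying on it.

```stata
* Sketch: copy the value labels of industry into a new label set whose
* entries are valid Stata names. indclean is a hypothetical label name.
sysuse nlsw88, clear
levelsof industry, local(levels)
foreach l of local levels {
    * current label text attached to value `l'
    local old : label (industry) `l'
    * replace blanks, slashes, etc. with underscores (assumes Stata 11+)
    local new = strtoname("`old'")
    label define indclean `l' "`new'", modify
}
label values industry indclean
prop industry
```

The original labels stay defined under their old name, so they can be reattached with -label values- once the estimates have been saved.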


HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Zoe Hyde
Sent: Thursday, 25 February 2010 08:08
To: [email protected]
Subject: st: Bug? proportion not using value labels

Hello all,
 
I am trying to get the proportion command to use the value labels associated
with a variable to identify subpopulations, but have possibly run into a
bug.  It seems the value labels are only used if they contain only
characters from one of the sets A-Za-z or 0-9, with no mixing of the two.
 
For example, this doesn't work:
 
label define agegroup_cat 1 "75-79" 2 "80-84" 3 "85-89" 4 "90+"
label values agegroup agegroup_cat
tab agegroup
 
   Age (w3) |      Freq.     Percent        Cum.
------------+-----------------------------------
      75-79 |      1,316       40.16       40.16
      80-84 |      1,335       40.74       80.90
      85-89 |        516       15.75       96.64
        90+ |        110        3.36      100.00
------------+-----------------------------------
      Total |      3,277      100.00
 
proportion had_sex, over(agegroup)
Proportion estimation               Number of obs    =    2783
      _prop_1: had_sex = 0
      _prop_2: had_sex = 1
    _subpop_1: agegroup = 75-79
    _subpop_2: agegroup = 80-84
    _subpop_3: agegroup = 85-89
    _subpop_4: agegroup = 90+
--------------------------------------------------------------
        Over | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
_prop_1      |
   _subpop_1 |   .6043668   .0144572      .5760189    .6327147
   _subpop_2 |   .7196429   .0134276      .6933137     .745972
   _subpop_3 |   .8142202   .0186477      .7776554    .8507849
   _subpop_4 |   .8902439   .0347317      .8221413    .9583465
-------------+------------------------------------------------
_prop_2      |
   _subpop_1 |   .3956332   .0144572      .3672853    .4239811
   _subpop_2 |   .2803571   .0134276       .254028    .3066863
   _subpop_3 |   .1857798   .0186477      .1492151    .2223446
   _subpop_4 |   .1097561   .0347317      .0416535    .1778587
--------------------------------------------------------------
 
 
But if the labels for groups 1 and 4 contain only characters from a single
set, then it does work for those two groups:
 
 
label define agegroup_cat2 1 "75" 2 "80to84" 3 "85 to 89" 4 "Ninetyplus"
label values agegroup agegroup_cat2

tab agegroup
   Age (w3) |      Freq.     Percent        Cum.
------------+-----------------------------------
         75 |      1,316       40.16       40.16
     80to84 |      1,335       40.74       80.90
   85 to 89 |        516       15.75       96.64
 Ninetyplus |        110        3.36      100.00
------------+-----------------------------------
      Total |      3,277      100.00

proportion had_sex, over(agegroup)
Proportion estimation               Number of obs    =    2783
      _prop_1: had_sex = 0
      _prop_2: had_sex = 1
           75: agegroup = 75
    _subpop_2: agegroup = 80to84
    _subpop_3: agegroup = 85 to 89
   Ninetyplus: agegroup = Ninetyplus
--------------------------------------------------------------
        Over | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
_prop_1      |
          75 |   .6043668   .0144572      .5760189    .6327147
   _subpop_2 |   .7196429   .0134276      .6933137     .745972
   _subpop_3 |   .8142202   .0186477      .7776554    .8507849
  Ninetyplus |   .8902439   .0347317      .8221413    .9583465
-------------+------------------------------------------------
_prop_2      |
          75 |   .3956332   .0144572      .3672853    .4239811
   _subpop_2 |   .2803571   .0134276       .254028    .3066863
   _subpop_3 |   .1857798   .0186477      .1492151    .2223446
  Ninetyplus |   .1097561   .0347317      .0416535    .1778587
--------------------------------------------------------------
 
 
Although I could use the key to work out which groups are which, I am
sending this output off to another dataset (with parmest) to produce some
graphs.  It's a real pain if I have to manually edit the dataset/re-label
variables for every graph I want to produce.
 
Does anyone have any ideas on how I can get proportion to use the value
labels I have defined, no matter what characters they contain?
 
 
Zoe.
 

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


