Re: st: difference in medians . Raw vs calculated
From
Steve Samuels <[email protected]>
To
[email protected]
Subject
Re: st: difference in medians . Raw vs calculated
Date
Sat, 19 Jan 2013 21:17:59 -0500
Correction:
4. Roger Newson's -bpmedian- package (SSC) can estimate a Bonett-Price CI for
the _difference between medians_, not just for a single median.
Steve
> Richard-
>
> Thanks for illustrating your problem with an accessible data set. Too
> few posters do. That said, nothing strange is going on here.
>
> 1. -cendif- estimates the "generalized Hodges-Lehmann median
> difference", which is the median of the differences between all possible
> pairs of observations, one drawn from each population. This is not the
> same as the "difference in medians".
>
> 2. The output for -cid- clearly states that the command is computing a
> difference in means, not medians.
>
> 3. Example 1 in the -help- for -qreg- discusses why the estimated regression coefficient might not be the difference in medians.
>
> 4. Roger Newson's -bpmedian- package (SSC) estimates a Bonett-Price CI for
> the median.
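
To make point 1 concrete, here is a rough sketch on the auto data that computes both quantities by brute force. It is only an illustration under my own assumptions: it takes the plain median of all 52 x 22 pairwise differences, which is not necessarily the rule -cendif- applies internally, so the result need not match -cendif-'s output exactly. The scalar and variable names (med_dom, med_for, w_domestic, w_foreign, diff) and the tempfile name are made up for the example.

```stata
sysuse auto, clear

* Difference in medians: median(Domestic) minus median(Foreign)
quietly summarize weight if foreign == 0, detail
scalar med_dom = r(p50)
quietly summarize weight if foreign == 1, detail
scalar med_for = r(p50)
display "difference in medians = " (scalar(med_dom) - scalar(med_for))

* Set the foreign weights aside in a temporary file
preserve
keep if foreign == 1
keep weight
rename weight w_foreign
tempfile foreigncars
save `foreigncars'
restore

* Pair every domestic car with every foreign car (52 x 22 = 1144 pairs)
keep if foreign == 0
keep weight
rename weight w_domestic
cross using `foreigncars'

* Median of the pairwise differences (a Hodges-Lehmann-type estimate)
generate diff = w_domestic - w_foreign
quietly summarize diff, detail
display "median of pairwise differences = " r(p50)
```

The first number is a difference of two group summaries; the second summarizes the distribution of car-to-car differences. There is no reason for the two to coincide unless the distributions differ only by a shift.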
>
> By the way, the p50 for a group is _not_ necessarily the sample median:
>
> . tab weight if foreign
>
>      Weight |
>      (lbs.) |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>       1,760 |          1        4.55        4.55
>       1,830 |          1        4.55        9.09
>       1,930 |          1        4.55       13.64
>       1,980 |          1        4.55       18.18
>       1,990 |          1        4.55       22.73
>       2,020 |          1        4.55       27.27
>       2,040 |          1        4.55       31.82
>       2,050 |          1        4.55       36.36
>       2,070 |          1        4.55       40.91
>       2,130 |          1        4.55       45.45
>       2,160 |          1        4.55       50.00
>       2,200 |          1        4.55       54.55
>       2,240 |          1        4.55       59.09
>       2,280 |          1        4.55       63.64
>       2,370 |          1        4.55       68.18
>       2,410 |          1        4.55       72.73
>       2,650 |          1        4.55       77.27
>       2,670 |          1        4.55       81.82
>       2,750 |          1        4.55       86.36
>       2,830 |          1        4.55       90.91
>       3,170 |          1        4.55       95.45
>       3,420 |          1        4.55      100.00
> ------------+-----------------------------------
>       Total |         22      100.00
>
>
> Notice n = 22, an even number of observations, so the median is not
> unique. By convention, it is the midpoint between the two middle observations,
> the 11th and 12th, which is, for these data, (2160 + 2200)/2 = 2180.
> But it could be any value between 2160 and 2200.
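
The arithmetic above can be checked directly in Stata. This is a minimal sketch, assuming the shipped auto data and relying on the fact (visible in the table above) that the 22 foreign weights are all distinct, so sorting gives a well-defined 11th and 12th observation.

```stata
sysuse auto, clear
keep if foreign == 1
sort weight

* The two middle order statistics of the 22 foreign cars
display weight[11] "  " weight[12]

* Midpoint convention for an even number of observations
display (weight[11] + weight[12])/2

* -summarize, detail- leaves the same midpoint in r(p50)
quietly summarize weight, detail
display r(p50)
```

Both displays should agree with the p50 of 2180 that -tabstat- reports for the Foreign group.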
>
> Steve
>
> Steven J. Samuels
> Consultant in Statistics
> 18 Cantine's Island
> Saugerties NY 12477 USA
> Voice: 845-246-0774
>
> On Jan 19, 2013, at 8:00 PM, Richard Hiscock wrote:
>
> I wish to derive a 95% CI for a difference in medians, and noticed that the difference in raw median values between groups did not equal that calculated using the packages cendif (R. Newson) and cid (P. Royston). Clearly I'm missing something and would be grateful for an explanation.
>
> I suspect it relates to a transformation performed prior to calculation of the difference & subsequent back transformation to original units.
>
> However, it is hard to present raw unit median values alongside a difference in medians (& CI) that does not match them. In my data set (plasma protein assay) the raw difference in medians is 0.5, whereas the difference calculated by cid or cendif is 0.33, making it hard to explain to readers.
>
> Thanks for any advice
>
> Illustrated using the auto data set:
>
> . sysuse auto
>
> . tabstat weight, by(foreign) stats(p50)
>
> Summary for variables: weight
>      by categories of: foreign (Car type)
>
>  foreign |       p50
> ---------+----------
> Domestic |      3360
>  Foreign |      2180
> ---------+----------
>    Total |      3190
> --------------------
>
> * difference = 1180
>
> . cendif weight, by(foreign)
> Y-variable: weight (Weight (lbs.))
> Grouped by: foreign (Car type)
> Group numbers:
>
>    Car type |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>    Domestic |         52       70.27       70.27
>     Foreign |         22       29.73      100.00
> ------------+-----------------------------------
>       Total |         74      100.00
> Transformation: Fisher's z
> 95% confidence interval(s) for percentile difference(s)
> between values of weight in first and second groups:
>  Percent   Pctl_Dif    Minimum    Maximum
>       50       1095        750       1330
>
> . cid weight,by(foreign) unpaired
>
> Normal-based confidence interval for difference in means by foreign
>
> Variable |    Obs    Estimate   Std. Err.    [95% Conf. Interval]
> ---------+-------------------------------------------------------------
>   weight |     74    1001.206   160.2876     681.6788    1320.734
>
> . qreg weight foreign
> Iteration 1: WLS sum of weighted deviations = 34840.693
>
> Iteration 1: sum of abs. weighted deviations = 34860
> note: alternate solutions exist
> Iteration 2: sum of abs. weighted deviations = 34620
> note: alternate solutions exist
> Iteration 3: sum of abs. weighted deviations = 34580
>
> Median regression                                    Number of obs =        74
>   Raw sum of deviations    48860 (about 3180)
>   Min sum of deviations    34580                     Pseudo R2     =    0.2923
>
> ------------------------------------------------------------------------------
>       weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>      foreign |      -1150   223.2969    -5.15   0.000    -1595.134   -704.8659
>        _cons |       3350   121.7526    27.51   0.000     3107.291    3592.709
> ------------------------------------------------------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/