st: R: highly skewed, highly zeroed data
As an alternative to Kieran's hint, and given the positive skewness of his
data, Jason may find it useful to calculate the desired 95% CI by fitting a
Gamma distribution to the estimated mean and its standard error and drawing
10,000 random values from it. For two interesting references, please see:
Briggs A, Nixon R, Dixon S, Thompson S. Parametric modelling of cost data:
some simulation evidence. Health Economics 2005; 14(4): 421-428 (freely
downloadable at http://eprints.gla.ac.uk/4151/);
Briggs A, Sculpher M, Claxton K. Decision Modelling for Health Economic
Evaluation. Oxford: Oxford University Press, 2006: 77-120.
............................begin example.................................
input time wt
mean time [fweight = wt]
Mean estimation                     Number of obs    =     647

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        time |   1.605873   .2343624      1.145669    2.066077
--------------------------------------------------------------
set obs 10000
g Gamma=(.2343624^2/1.605873)*invgammap((1.605873/.2343624)^2, uniform())
sum Gamma
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       Gamma |     10000    1.605746    .2343959   .8457972   2.601775
centile Gamma, centile (2.5 97.5)
                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
       Gamma |   10000         2.5     1.177285        1.170511    1.187588
             |                97.5      2.09881        2.083514    2.114182
............................end example....................................
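The same steps can also be written without retyping the estimates. What follows is only a minimal sketch: it assumes that time and wt are still in memory, and the seed value and the scalar names (m, se, shape, scale) are illustrative choices that do not appear in the example above. The Gamma is matched by moments, with shape = (mean/SE)^2 and scale = SE^2/mean, so that its mean equals the estimated mean and its standard deviation equals the standard error.

```stata
* Sketch only: same moment-matching steps as in the example above,
* but reading the estimates from -mean- instead of typing them in.
* Assumes the variables time and wt are in memory.
set seed 12345                       // arbitrary seed, for reproducibility
mean time [fweight = wt]
scalar m     = _b[time]              // estimated mean
scalar se    = _se[time]             // standard error of the mean
scalar shape = (m/se)^2              // Gamma shape = mean^2 / variance
scalar scale = se^2/m                // Gamma scale = variance / mean

* Replace the data with 10,000 draws from the fitted Gamma
drop _all                            // drops variables only; scalars are kept
set obs 10000
* uniform() as in the example above; newer releases call it runiform()
generate double Gamma = scalar(scale)*invgammap(scalar(shape), uniform())

* The 2.5th and 97.5th centiles give the simulated 95% interval
centile Gamma, centile(2.5 97.5)
```

Because the draws are random, the centiles will differ slightly from those shown above unless the same seed and random-number generator are used.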
HTH and Kind Regards,
Carlo
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Jason Ferris
Sent: Wednesday, 25 November 2009 3:07
To: [email protected]
Subject: st: highly skewed, highly zeroed data
Hi,
I have tried to find my answer in the statalist repository but nothing
has quite hit the mark.
I would like to calculate a mean and 95% CI for these data, which are
highly skewed and mostly zeros.
I am aware of adding a constant and then transforming to the log scale
(with the antilog for interpretation). However, after adding a constant to
overcome the zero issue and then transforming to the log scale, I am
still left with a highly skewed distribution, which gets me no closer to
a mean and CI.
PS. As this is survey data I would be most keen for the 'right' answer
to be addressed in svy: terms
Jason
 time (hrs) |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        518       80.06       80.06
        .25 |          2        0.31       80.37
         .5 |          3        0.46       80.83
          1 |         15        2.32       83.15
        1.5 |          1        0.15       83.31
          2 |         23        3.55       86.86
          3 |         10        1.55       88.41
        3.5 |          1        0.15       88.56
          4 |         11        1.70       90.26
          5 |         13        2.01       92.27
          6 |          9        1.39       93.66
          7 |          3        0.46       94.13
          8 |         19        2.94       97.06
         20 |         10        1.55       98.61
         45 |          9        1.39      100.00
------------+-----------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/