st: Mean of interval-censored data
From: <[email protected]>
To: <[email protected]>
Subject: st: Mean of interval-censored data
Date: Mon, 28 Nov 2011 09:11:37 -0000
------------------------------
Date: Sun, 27 Nov 2011 11:28:58 -0600
From: Paul von Hippel <[email protected]>
Subject: st: Mean of interval-censored data
I am interested in using the intcens command for Stata to model an
income distribution. I wonder if I can ask for your advice on how the
program handles weights, and on whether it can output expected values.
Many thanks for your time.
I have interval-censored data on the distribution of family income
within various school districts. The data for one district are below
my signature. bin_min and bin_max are the endpoints of the interval
(the top interval has only one endpoint), and fb is the number of
families in the interval. I would like to estimate the distribution of
income and derived quantities, most importantly the mean.
Two questions, if I may:
1. Will intcens provide me the mean income, or will I need to
calculate it from the parameters of the distribution?
2. It looks to me as though intcens can handle weights through a
command like this: "intcens bin_min bin_max fb". But if this is the
syntax, how can intcens tell that fb is a weight and not a regressor?
Finally: Is there a different command that I should be using for this
purpose?
Thanks for any advice.
Best wishes,
Paul von Hippel
 fb   bin_min   bin_max
 21         0     10000
 22     10000     14999
 29     15000     19999
105     20000     24999
 80     25000     29999
155     30000     34999
159     35000     39999
 68     40000     44999
138     45000     49999
210     50000     59999
264     60000     74999
324     75000     99999
123    100000    124999
129    125000    149999
 75    150000    199999
110    200000         .
================================
You could use -intcens- but the distributions that it fits are not ones
that are commonly used to describe income distributions. I suggest that
you instead look at something like -gbgfit- (by Austin Nichols on SSC),
as this handles interval-censored (grouped) data. You'd have to recode
your bin boundary variables slightly to make the program work (see the
help file).
If the program doesn't take weights, then you should be able to work
around this before estimation by multiplying your frequency variable
("fb"?) by your weights. [Of course, this amounts to frequency
weighting; it does not account for design weights etc.]
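
To illustrate what frequency weighting means for grouped data, here is a minimal sketch in Python (not Stata, and not the code of any of the programs mentioned). It assumes a lognormal distribution purely for illustration; the function names and the values of mu and sigma are hypothetical, and only three of the bins posted above are used. Each bin contributes its count times the log of the probability mass the distribution places in that bin, which is the same as repeating the bin once per family.

```python
import math

def lognormal_cdf(x, mu, sigma):
    """Lognormal CDF; math.inf stands for an open-ended top bin."""
    if x <= 0:
        return 0.0
    if math.isinf(x):
        return 1.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def grouped_loglik(bins, mu, sigma):
    """Log-likelihood for (lower, upper, count) bins.

    The count acts as a frequency weight on the bin's log probability.
    """
    total = 0.0
    for lower, upper, count in bins:
        p = lognormal_cdf(upper, mu, sigma) - lognormal_cdf(lower, mu, sigma)
        total += count * math.log(p)
    return total

# Three bins taken from the posted data; the missing upper bound of the
# top bin is represented as +infinity.
bins = [(0, 10000, 21), (10000, 14999, 22), (200000, math.inf, 110)]

# Equivalent "expanded" data: one row per family, each with count 1.
expanded = [(lo, hi, 1) for lo, hi, n in bins for _ in range(n)]

# Hypothetical parameter values, chosen only to evaluate the function.
mu, sigma = 11.0, 0.8
weighted = grouped_loglik(bins, mu, sigma)
unweighted_expanded = grouped_loglik(expanded, mu, sigma)
print(weighted, unweighted_expanded)
```

The two printed values agree up to floating-point rounding, which is why a count variable can stand in for frequency weights. Design (sampling) weights would need different treatment.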
Once you have the parameter estimates, you'd be able to calculate a
number of distributional summary statistics from the saved results. (See
also my -gb2fit- on SSC.)
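
As a rough illustration of calculating the mean from parameter estimates, the sketch below (Python, standard library only) uses the textbook formula for the mean of a generalized beta of the second kind distribution, E[X] = b * B(p + 1/a, q - 1/a) / B(p, q), which exists only when a*q > 1. It assumes the common (a, b, p, q) parameterization with a > 0; check the help file of whichever program you use to confirm how its saved results map onto these parameters. The function names and the parameter values in the example are hypothetical and are not estimates from the posted data.

```python
import math

def log_beta(x, y):
    """Natural log of the beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def gb2_mean(a, b, p, q):
    """Mean of a GB2(a, b, p, q) distribution, assuming a, b, p, q > 0.

    The mean is finite only if a*q > 1.
    """
    if a * q <= 1:
        raise ValueError("mean does not exist unless a*q > 1")
    return b * math.exp(log_beta(p + 1.0 / a, q - 1.0 / a) - log_beta(p, q))

# Hypothetical parameter values for illustration only. With a = 1 and
# p = 1 the GB2 reduces to a Lomax (Pareto type II) distribution, whose
# mean is b / (q - 1), so this can be checked by hand.
example_mean = gb2_mean(1.0, 50000.0, 1.0, 3.0)
print(example_mean)
```

With these illustrative values the result is about 25000, matching b / (q - 1) for the Lomax special case.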
Stephen
------------------
Professor Stephen P. Jenkins <[email protected]>
Department of Social Policy and STICERD
London School of Economics and Political Science
Houghton Street, London WC2A 2AE, UK
Tel: +44(0)20 7955 6527
Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP
2011, http://ukcatalogue.oup.com/product/9780199226436.do
Survival Analysis Using Stata:
http://www.iser.essex.ac.uk/survival-analysis
Downloadable papers and software: http://ideas.repec.org/e/pje7.html