st: RE: Fast way to calculate Gini coefficients
A program you haven't mentioned is -somersd-, which can also be used to calculate Gini coefficients, and can be downloaded from SSC. To do this in a Stata session, type
ssc desc somersd
for a brief description, and
ssc install somersd, replace
to install the package, and
net get somersd
to copy the 3 .pdf manuals for the -somersd- package to your current local folder. The manual -somersd.pdf- contains an example of the use of -somersd- for calculating a Gini coefficient. This example also appears in Newson (2006a).
I do not know whether -somersd- is faster or slower than the alternatives that you mention. However, it uses the algorithm of Newson (2006b), which calculates a confidence interval in a time asymptotically proportional to N log N, rather than to the square of N, where N is the sample size.
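Since the relative speed is unknown, one way to find out is to time each candidate on your own data with Stata's built-in -timer- commands. The following is only a sketch: it assumes that -fastgini- is already installed from SSC and that the income variable is called income (a hypothetical name).

```stata
* Sketch: time one candidate command on the data in memory.
* "income" is a hypothetical variable name; -fastgini- must be installed.
timer clear
timer on 1
fastgini income
timer off 1
* Report the elapsed time (in seconds) recorded by timer 1
timer list 1
```

Repeating the block with timer 2, timer 3, and so on for each alternative command gives comparable timings on the same dataset.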
If you are calculating Gini coefficients for a large number of subsets, then the -parmby- module of the -parmest- package might be useful. The -parmest- package can also be downloaded from SSC, using the -ssc- command.
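If only the point estimates are needed, with no confidence intervals, the Gini coefficient of every subset can also be obtained in one sorted pass using built-in commands alone. This is a minimal sketch, not the method used by -somersd- or -parmby-. It assumes unweighted, nonnegative incomes, and the variable names income and group are hypothetical. It uses the standard rank formula G = 2*sum(i*x_i)/(n*sum(x_i)) - (n+1)/n, where i is the rank of x_i within its group.

```stata
* Sketch only: unweighted Gini point estimates for all groups at once.
* "income" and "group" are hypothetical variable names.
* Note: -collapse- replaces the data in memory; type -preserve- first if needed.
keep if !missing(income, group)
* Within each group, sort by income and multiply each value by its rank
bysort group (income): gen double rank_x = _n * income
* One row per group: sum of rank*income, sum of income, group size
collapse (sum) sum_rank_x = rank_x sum_x = income (count) n = income, by(group)
* Rank formula for the Gini coefficient (missing if group income sums to zero)
gen double gini = 2*sum_rank_x/(n*sum_x) - (n + 1)/n
list group gini if _n <= 10
```

The cost is dominated by the single sort, so the running time should grow roughly as N log N however many groups there are. For several income variables, the same steps can be repeated for each variable, reloading or restoring the full dataset between passes.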
I hope this helps.
Best wishes
Roger
References
Newson R. 2006a. Confidence intervals for rank statistics: Somers' D and extensions. The Stata Journal 6(3): 309-334.
Newson R. 2006b. Efficient calculation of jackknife confidence intervals for rank statistics. Journal of Statistical Software 15(1): 1-10.
Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Philipp Rehm
Sent: 15 May 2009 14:29
To: [email protected]
Subject: st: Fast way to calculate Gini coefficients
Dear all,
Does anyone know the fastest way to compute Gini coefficients of several income variables over many sub-groups (say, 2000) of a large dataset (with several million observations)? I am using Stata 10.1 on Windows XP.
There are many user-written programs for calculating Gini coefficients. I have been using the following:
(1) -egen_inequal- (by Michael Lokshin and Zurab Sajaia), followed by a -collapse- (although this can be sped up by using -keep-) and
(2) -fastgini- (by Zurab Sajaia), looping over groups and collecting the results in a matrix.
Both approaches take quite a long time (I haven't timed them against each other). Any ideas?
Many thanks,
Philipp
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/