Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: Re: Efficiently looping through countries and years counting and computing the percentage of people whom selected a specific answer

From	"J. J. W." <[email protected]>
To	[email protected]
Subject	st: Re: Efficiently looping through countries and years counting and computing the percentage of people whom selected a specific answer
Date	Thu, 6 Jun 2013 04:58:25 +0200

Dear all,

I have a small problem, which I have solved, but I was wondering:

- What is the usual way to do this?
- Can this be implemented more efficiently?

Suppose I have

Country Year Female

Netherlands 1990 1
Netherlands 1990 0
Netherlands 1990 1
Netherlands 1991 1
Netherlands 1991 1
Netherlands 1991 1
Netherlands 1992 1
Netherlands 1992 0
...

I would like to calculate the number of females as a percentage of
the total, for every country and every year. I have devised a script
for it, presented below:

gen per_female = 0

/* Get the minimum and maximum indices for countries */
su country_id, meanonly

/* Loop over all countries */
forvalues i = `r(min)'/`r(max)' {

    su year if country_id == `i', meanonly
    /* Loop over all years observed for this country */
    forvalues j = `r(min)'/`r(max)' {
        /* Numerator: females in this country and year */
        count if country_id == `i' & female == 1 & year == `j'
        local nr_females = r(N)
        /* Denominator: non-missing answers in this country and year */
        count if country_id == `i' & year == `j' & (female == 1 | female == 0)
        local nr_obser = r(N)
        replace per_female = `nr_females'/`nr_obser' if country_id == `i' & year == `j'
    }
}

It basically works; however, there are some problems.

a) I do not believe this is an efficient computation, since there are a
LOT of cases in which no replacements are made at all. How can I make
this more efficient?

b) Is my way "the way to go"? I believe this is more like programming,
and I am wondering how this can be done more easily in Stata (even
though my method is relatively easy and straightforward).

c) At the moment I use the condition "(female == 1 | female == 0)".
This ensures that I only count the observations that I actually have
and excludes the ones with missing values (female == .).
Is this correct? Should I handle missing data in this way?
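
For comparison, here is a minimal sketch of one loop-free way to get the same share, assuming the variable names from the script above (country_id, year, female). The toy data are illustrative only: they mirror the example rows, with one extra row added where female is missing to show how missing values are treated. The helper names nr_obser and tag are hypothetical. Since female is coded 0/1, its mean within a country-year is the share of females, and egen's mean() and count() functions skip missing values.

```stata
* Toy data mirroring the example rows; country_id 1 stands in for Netherlands.
* The last row (female missing) is added only to illustrate missing handling.
clear
input country_id year female
1 1990 1
1 1990 0
1 1990 1
1 1991 1
1 1991 1
1 1991 1
1 1992 1
1 1992 0
1 1992 .
end

* Share of females within each country-year; mean() ignores missing values
egen per_female = mean(female), by(country_id year)

* Number of non-missing answers behind each share (the denominator)
egen nr_obser = count(female), by(country_id year)

* Flag one row per country-year and list the results
egen tag = tag(country_id year)
list country_id year per_female nr_obser if tag, noobs
```

Multiplying per_female by 100 would turn the share into a percentage. This is only one possible approach; collapse or statsby could be used instead if a dataset with one row per country-year is wanted.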

Any suggestions, advice or comments are very helpful and appreciated!

Thank you for your answer!

Wen Jun Jie

Follow-Ups:
- st: RE: Re: Efficiently looping through countries and years counting and computing the percentage of people whom selected a specific answer
  - From: tshmak <[email protected]>
