Re: st: Counting observations within groups
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Counting observations within groups
Date: Thu, 29 Nov 2012 17:59:38 -0500
Daniel Escher <[email protected]>:
Make an empty variable, loop over counties, filling in values as you
go, something like this:
su totprod, mean
loc m=r(mean)
qui levelsof fips, loc(fs)
g long nbig=.
foreach f of loc fs {
    qui count if (totprod>`m'&totprod<.)&(sic==12110|sic==11110)&fips==`f'
    replace nbig = r(N) if fips==`f'
}
Sometimes the list to loop over can get too long, in which case:
su totprod, mean
loc m=r(mean)
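egen i=group(fips)
* nbig must exist before -replace-; -cap- skips this if it was already created above
cap g long nbig=.
su i, mean
forv i=1/`r(max)' {
    qui count if (totprod>`m'&totprod<.)&(sic==12110|sic==11110)&i==`i'
    replace nbig = r(N) if i==`i'
}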
is an alternative.
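A loop-free variant, which is not part of the reply above, is to total a true/false expression within counties using -egen-. The sketch below runs on a small made-up dataset (the fips, sic, and totprod values are invented for illustration only), and nbig2 is a hypothetical variable name. The expression is 1 for a qualifying mine and 0 otherwise, so its within-county sum is the count.

```stata
* Toy data, invented for illustration only
clear
input fips sic totprod
1001 12110 10
1001 12110 70
1001 11110 90
1003 12110 80
1003 14000 95
1003 11110 .
end

* overall mean of production (69 in this toy data)
su totprod, mean
loc m=r(mean)

* sum the 0/1 indicator within each county; the parentheses keep the
* sic test grouped so that & and | combine as intended
egen long nbig2 = total((totprod>`m'&totprod<.)&(sic==12110|sic==11110)), by(fips)

* in the toy data: two qualifying mines in 1001, one in 1003
assert nbig2==2 if fips==1001
assert nbig2==1 if fips==1003
list fips sic totprod nbig2, sepby(fips)
```

The explicit parentheses matter: Stata evaluates & before |, so a condition written as a & b | c is read as (a & b) | c, which counts every mine with the second sic code regardless of production.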
On Thu, Nov 29, 2012 at 5:48 PM, Daniel Escher <[email protected]> wrote:
> Hello,
>
> I am trying to count the number of mines in a county by production.
> I.e., I'd like the number of mines in each county that are above the
> overall mean of production, and the number that are below. There are
> multiple mines per county, which is identified by its FIPS code.
> Missing data are marked by . The data are in long format.
>
> Here's what I have so far:
> . *bigmines = # of mines in a county above the overall mean
> . *totprod = total production per mine
> . *sic = type of mine
>
> . *ATTEMPT ONE
> . sort fips
> . su totprod // to get mean
> . by fips: egen bigmines = count(inrange(totprod, r(mean), .) &
> sic==12110 | sic==11110) // This gives me total number of mines per
> FIPS code - not those that meet the criteria
> . drop bigmines
>
> . *ATTEMPT TWO
> . su totprod // to get mean
> . by fips: egen bigmines = total(mshahrs > r(mean) & sic==12110 |
> sic==11110) // This gives me the total number of mines per FIPS code
> if any mine exceeds the mean
> . drop bigmines
>
> . *ATTEMPT THREE
> . *Then I read Nick Cox's helpful article
> (http://www.stata-journal.com/sjpdf.html?articlenum=pr0029) which
> clued me in to -count-:
> . gen bigmines = 0
> . su totprod
> . count if inrange(totprod, r(mean), .) & sic==12110 | sic==11110
> . replace bigmines = r(N)
>
> The last attempt is what I want, and it "works." However, I don't know
> how to -count- and then store r(N) for each FIPS code. Using -by- does
> not seem to work. This probably requires a loop like...
>
> forvalues j = all values of fips {
> count if inrange(mshahrs, r(mean), .) & sic==12110 | sic==11110
> replace bigmines_hrs = r(N)
> }
>
> Is this close? Thank you so much for your help and time.
>
> Gratefully,
> Daniel