Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Count of unique cases by group
From: "Ben Hoen" <[email protected]>
To: <[email protected]>
Subject: RE: st: Count of unique cases by group
Date: Fri, 6 Jan 2012 11:01:33 -0500
Perfect. Thanks Nick! Your code has a number of additional insights (for
me) in it over and above answering the questions.
Best,
Ben
Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Friday, January 06, 2012 10:47 AM
To: [email protected]
Subject: Re: st: Count of unique cases by group
By "unique" you evidently mean "distinct". (For more on that
distinction, see the 2008 paper below or the manual entry for
-duplicates-.)
Variants of this question have been discussed in

FAQ     Calculating the number of distinct values
        N. J. Cox
        9/06    How do I calculate the number of distinct values seen so far?
                http://www.stata.com/support/faqs/data/distinctvalues.html

FAQ     Counting distinct strings across a set of variables
        N. J. Cox
        7/04    How do I count the number of distinct strings across a set of variables?
                http://www.stata.com/support/faqs/data/distinctstrings.html

FAQ     Number of distinct observations
        N. J. Cox and G. Longton
        4/02    How do I compute the number of distinct observations?
                http://www.stata.com/support/faqs/data/distinct.html

SJ-8-4  dm0042  Speaking Stata: Distinct observations
        (help distinct if installed)  N. J. Cox and G. M. Longton
        Q4/08   SJ 8(4):557--568
                shows how to answer questions about distinct observations
                from first principles; provides a convenience command

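This may help:

sysuse auto, clear
* tag exactly one observation for each distinct (rep78, mpg) pair
egen tag = tag(rep78 mpg)
* number of distinct mpg values within each rep78 group
egen distinct = total(tag), by(rep78)
* number of observations within each rep78 group
bysort rep78 : gen freq = _N
gen fraction = distinct / freq
tabdisp rep78, c(distinct freq fraction)
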
----------------------------------------------
   Repair |
   Record |
     1978 |   distinct       freq   fraction
----------+-----------------------------------
        1 |          2          2          1
        2 |          6          8        .75
        3 |         15         30         .5
        4 |         11         18   .6111111
        5 |          8         11   .7272727
        . |          0          5          0
----------------------------------------------
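
The same steps can be sketched for the make2/area example in the quoted question. This is only a sketch under assumptions: it reuses the data preparation from the quoted message, and the names tag, distinct, freq and pctcount are chosen here for illustration. egen's tag() function accepts string variables such as make2.

```stata
* data preparation, as in the quoted question
sysuse auto, clear
gen make2 = word(make, 1)
rename rep78 area
keep make2 area price
keep in 1/20
drop if missing(area)

* tag exactly one observation for each distinct (area, make2) pair
egen tag = tag(area make2)
* number of distinct make2 values within each area
egen distinct = total(tag), by(area)
* number of observations within each area
bysort area : gen freq = _N
* illustrative name, matching the column asked for in the question
gen pctcount = distinct / freq
tabdisp area, c(distinct freq pctcount)
```

Here distinct and freq play the roles of the numerator and denominator in the "underlying calculation" column of the question.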
Nick
On Fri, Jan 6, 2012 at 3:30 PM, Ben Hoen <[email protected]> wrote:
> I want to count the number of unique cases within a group to generate a
> summary table. -collapse- gets me part of the way there, but not all of
> the way.
>
> sysuse auto, clear
> g make2=word(make,1)
> rename rep78 area
> keep make2 area price
> keep in 1/20
> drop if area==.
> sort area make2
> order area make2
> list
>
> Using these records, I would like to produce a table with a statistic
> representing the count of the unique groups of make2 in each area, divided
> by the count of make2 in each area.
>
> i.e. pctcount=count of groups of make2 / count of make2
>
> So the output table would look like this ideally (minus the "underlying
> calculation" column):
>
> area    pctcount    <underlying calculation>
> 2       0.6666      2/3
> 3       0.3333      4/12
> 4       1           2/2
> 5       1           1/1
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/