*flag each person just once
bysort hhid persid: gen byte perperson=(_n==1)
* calculate number of persons per household
egen totpeople=sum(perperson), by(hhid)
* flag each household once, to avoid duplicates in list commands
bysort hhid: gen byte taghh=(_n==1)
l hhid if taghh==1 & totpeople==1
l hhid if taghh==1 & totpeople==2
.. etc..
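A possible follow-up for question (1), sketched here as an addition and not part of the reply itself: since taghh marks one row per household, tabulating totpeople over just those rows should give the number of households with 1, 2, 3, ... persons. The sketch rebuilds the eight sample rows from the question so that it runs on its own. The storage types (long, str8, str10) and the use of egen's total() function, the current name for sum(), are assumptions of this sketch.

```stata
* rebuild the sample rows from the question (storage types are assumed)
clear
input long hhid long persid str8 action str10 date
12345 1234501 "EAT"      "1/1/2003"
12345 1234501 "DRINK"    "1/2/2003"
12345 1234501 "DRINK"    "1/3/2003"
12345 1234501 "BE MERRY" "1/4/2003"
12345 1234502 "DRINK"    "1/1/2003"
12345 1234502 "EAT"      "1/3/2003"
12345 1234503 "BE MERRY" "1/2/2003"
12346 1234601 "BE MERRY" "1/1/2003"
end

* flag each person just once
bysort hhid persid: gen byte perperson = (_n==1)
* number of distinct persons per household
egen totpeople = total(perperson), by(hhid)
* flag each household once
bysort hhid: gen byte taghh = (_n==1)

* question (1): how many households have 1, 2, 3, ... persons
tabulate totpeople if taghh==1
```

In the sample rows, household 12345 has three persons and household 12346 has one, so the table should report one household at each of those sizes.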
Michael Blasnik
----- Original Message -----
From: "Donnell Butler" <[email protected]>
To: <[email protected]>
Sent: Thursday, March 25, 2004 6:10 AM
Subject: st: Counts of different values in one variable by another variable
> Good Day,
>
> I am trying to do something which I imagine must be easy to do in Stata, but
> I can't find the solution in the manuals, help books, or FAQ online.
> Clearly, I am not thinking clearly, because this seems like a simple
> request. Perhaps I just don't know how to phrase the question correctly in
> my search for the answer. Nevertheless, I am hoping that someone can help or
> direct me to an existing response that may answer my question, with the
> Statalist archive number or month/year.
>
>
> Here is a simplified version of my dilemma:
>
> I have a data set with multiple id numbers. There is always one
> primary id (hhid), but sometimes there is more than one subsidiary id
> (persid). The persid is simply the hhid with two more digits appended. For
> example, hhid=12345 and persid=1234501 (or, in the cases where there is more
> than one, persid=1234501, 1234502, 1234503, etc.). The records are structured
> such that for every action on a given date, there is a record. For
> example:
>
> HHID   PERSID   ACTION    DATE
> 12345  1234501  EAT       1/1/2003
> 12345  1234501  DRINK     1/2/2003
> 12345  1234501  DRINK     1/3/2003
> 12345  1234501  BE MERRY  1/4/2003
> 12345  1234502  DRINK     1/1/2003  <- Note new person id, but same hhid
> 12345  1234502  EAT       1/3/2003
> 12345  1234503  BE MERRY  1/2/2003  <- Note new person id, but same hhid
> 12346  1234601  BE MERRY  1/1/2003  <- Note new hhid
>
> ... and so on.
>
> So, here is my dilemma: I am trying to find a command or commands that
> will do two things:
> (1) For the entire data set, across all households, how many times are
> there 1, 2, 3, ... N unique PERSIDs within a household? That is,
> how many households have 1, 2, 3, ... N persons?
> (2) Display the HHID for households that have X persons. That
> is, for households with X unique PERSIDs within a household,
> list the HHIDs.
>
> It seems so simple, but the count command can't count within variables.
> The egen command can't work with by commands. Clearly, there is an obvious
> answer but I can't seem to figure it out. Please help.
>
> Thanks,
> Donnell
>
> Donnell Butler
> Ph.D. Candidate
> Princeton University
> 125 Wallace Hall
> Princeton, NJ 08540
> 609-419-1311