In addition to the suggestions by Maarten and Philipp, another way would be:
clear
input str6 householdid str8 personid infected
010101 01010101 1
010101 01010102 1
010102 01010201 0
010102 01010202 1
010102 01010203 1
010103 01010301 0
010103 01010302 0
010103 01010303 0
010104 01010401 0
010104 01010402 1
end
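* flag infected persons (1 if infected, 0 otherwise, including missing)
gen hinfect = infected == 1
* within each household, sort so that any 1 comes last, then copy that last value to every member
bysort householdid (hinfect): replace hinfect = hinfect[_N]
list, sepby(householdid)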
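The sort-and-copy step above amounts to taking the household maximum of a 0/1 indicator. As a minimal sketch of the same idea in one line, assuming the example data and hinfect from above are still in memory (hinfect2 is just a name made up for this illustration, and this may well overlap with what was already suggested):

```stata
* hinfect2 is a hypothetical variable name used only for this sketch
* (infected == 1) is 1 for infected persons and 0 otherwise; max() takes the largest value within each household
egen hinfect2 = max(infected == 1), by(householdid)
* check that both approaches agree on every observation
assert hinfect2 == hinfect
```

With the example data, households 010101, 010102 and 010104 get 1 and household 010103 gets 0 under either approach.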
Scott
> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Honorati Masanja
> Sent: Wednesday, November 29, 2006 4:08 PM
> To: [email protected]
> Subject: st: Filling gaps???
>
> Dear all
>
> I have a dataset with individuals in households. Each individual has a
> unique identifier. Some individuals in the households are infected and
> some are not. My problem is how do I tell Stata to create a new variable
> which will have 1 for households with at least one infected person and
> 0 for households without infected persons. The dataset looks like this:
>
> HouseholdID PersonID Infected
> 010101 01010101 1
> 010101 01010102 1
> 010102 01010201 0
> 010102 01010202 1
> 010102 01010203 1
> 010103 01010301 0
> 010103 01010302 0
> 010103 01010303 0
> 010104 01010401 0
> 010104 01010402 1
>
> Many thanks
>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/