Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Creating household id for groups of persons
From: Robert Picard <[email protected]>
To: [email protected]
Subject: Re: st: Creating household id for groups of persons
Date: Wed, 6 Jul 2011 13:01:26 -0400
Unless I'm mistaken, Fernando's solution will not always group
households correctly. In the simple example below, there are 3
contracts covering 4 different members of the same household. Such cases
require more than one pass over the data (contract 13 groups ids 2 and
4, and then contracts 11 and 12 group ids 1, 2, 3, and 4 together).
* --------------------- begin example ---------------------
clear all
input contract id
11 1
11 2
12 3
12 4
13 2
13 4
end
tempfile f
qui save "`f'"
* implement Fernando's approach
egen cid = group(contract)
bysort id: egen mincid = min(cid)
bysort contract: egen hid = min(mincid)
list , noobs clean
* redo using -group_id-
use "`f'", clear
clonevar hid = id
group_id hid, match(contract)
list , noobs clean
* --------------------- end example -----------------------
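If installing -group_id- from SSC is not an option, the "more than one pass" idea can be sketched with built-in commands only. This is a minimal sketch under stated assumptions, not the -group_id- implementation: the variable names cmin and pmin and the convergence loop are illustrative. It repeats Fernando's two min() steps until no household label changes.

```stata
clear all
input contract id
11 1
11 2
12 3
12 4
13 2
13 4
end

* start with each person as his or her own household
clonevar hid = id

* repeat the two passes until the labels stop changing
local changed = 1
while `changed' {
    * smallest label among all persons on the same contract
    bysort contract: egen long cmin = min(hid)
    * smallest label among all contracts held by the same person
    bysort id: egen long pmin = min(cmin)
    quietly count if pmin != hid
    local changed = r(N)
    quietly replace hid = pmin
    drop cmin pmin
}

list , noobs clean
```

On the example data all six observations end up with hid equal to 1. The number of passes grows with the length of the longest chain of linked contracts, so this may be slow on data with long chains. With large id values, id and hid should be stored as long or double rather than float to avoid rounding.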
On Wed, Jul 6, 2011 at 11:47 AM, Hans Meier <[email protected]> wrote:
> Hello Austin and Robert,
>
> thank you for your solutions.
> I'm sure they would work, but I have a very large dataset, so Austin's solution would take hours, and for Robert's solution I would have to use SSC.
>
> But another Stata user sent me this solution:
>
> From: "Fernando Rios Avila" <[email protected]>
> Sent: 06.07.2011 15:18:00
> To: "'Hans Meier'" <[email protected]>
> Subject: RE: st: Creating household id for groups of persons
>
>>Hi Hans,
>>I was playing around with a very small sample similar to yours, and came up with this short piece of code.
>>Here hid3 would be the household id code.
>>
>> egen hid1=group(contract)
>> bysort id: egen hid2=min(hid1)
>> bysort contract: egen hid3=min(hid2)
>>
>>Hope this is what you were looking for.
>>Best
>
>
> It works perfectly, and very fast.
>
> Thank you Fernando!
>
>
>
> -----Original Message-----
> From: "Robert Picard" <[email protected]>
> Sent: 06.07.2011 16:50:42
> To: [email protected]
> Subject: Re: st: Creating household id for groups of persons
>
>>Or get -group_id- from SSC. Using Austin's data:
>>
>>* --------------------- begin example ---------------------
>>clear all
>>input contract id
>> 123 1
>> 123 2
>> 123 3
>> 456 4
>> 456 5
>> 678 1
>> 456 3
>> 789 6
>> 789 7
>> 456 8
>>end
>>
>>clonevar gid = id
>>group_id gid, match(contract)
>>
>>list , noobs clean
>>
>>* --------------------- end example -----------------------
>>
>>
>>On Wed, Jul 6, 2011 at 10:29 AM, Austin Nichols <[email protected]> wrote:
>>> Hans Meier <[email protected]>:
>>>
>>> Maybe this is what you want?
>>>
>>> clear all
>>> input contract id
>>> 123 1
>>> 123 2
>>> 123 3
>>> 456 4
>>> 456 5
>>> 678 1
>>> 456 3
>>> 789 6
>>> 789 7
>>> 456 8
>>> end
>>> g long obs=_n
>>> egen long i=group(id)
>>> la var i "Person id from 1 to M"
>>> egen long gp=group(contract)
>>> la var gp "Contract id from 1 to G"
>>> bys i (gp):g long ct=sum(gp!=gp[_n-1])
>>> la var ct "n distinct contract by id"
>>> sort i ct
>>> su i, mean
>>> forv i=1/`r(max)' {
>>> su ct if i==`i', mean
>>> if r(max)==1 continue
>>> loc max=r(max)
>>> su gp if ct==1&i==`i', mean
>>> loc g1=r(max)
>>> forv j=2/`max' {
>>> su gp if ct==`j'&i==`i', mean
>>> replace gp=`g1' if gp==r(max)
>>> }
>>> }
>>> sort obs
>>> drop obs ct i
>>> l, noo clean
>>>
>>>
>>>
>>> On Wed, Jul 6, 2011 at 8:45 AM, Hans Meier <[email protected]> wrote:
>>>> Yes, now you got my question right.
>>>> I don't know who lives in which household, and I also don't have further information about this.
>>>>
>>>> But I assume that if people have an insurance contract together, they are somehow connected, and I define them as one household.
>>>> (I look only at non-life insurance, no pension funds etc.)
>>>>
>>>> In my example, I define the persons from contract "123" (id's "1", "2", "3") as one household, let's say household A, and those in contract "456" (id's "4", "5") as another household B.
>>>> Now, in contract "678", the id "1" tells me that this is the same person who is also in the contract "123", so I want this contract to be put in household A.
>>>>
>>>> To your question:
>>>> Unfortunately, I have a very large dataset, so I can't tell if I have one contract in each household that covers all household members.
>>>> To err on the side of caution, I would rather assume I don't have such complete contracts.
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/
>>>
>>
>
>
>