Asad's sample data will help fix ideas: 

hhold	id	s_id	age	s_age
23	2	1	30	35
23	1	2	35	30
23	4	.	65	.
23	3	.	3	.
45	2	1	50	40
45	1	2	40	50
45	6	2	30	50
45	8	.	5	.
45	5	.	5	.
45	4	.	8	.
45	7	.	2	.
45	3	.	12	.
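
For anyone wanting to run the steps below, here is one way 
to read the sample data into Stata. It is only a sketch using 
-input-, with the variable names taken from the listing above. 

```stata
* enter the sample data by hand; . is numeric missing
clear
input hhold id s_id age s_age
23 2 1 30 35
23 1 2 35 30
23 4 . 65 .
23 3 . 3 .
45 2 1 50 40
45 1 2 40 50
45 6 2 30 50
45 8 . 5 .
45 5 . 5 .
45 4 . 8 .
45 7 . 2 .
45 3 . 12 .
end
```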

Here is a revised stab at the problem: 

1. Count how many times each spouse identifier 
occurs within each household: 

. bysort hhold s_id : gen ns = _N * (s_id <  .) 
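
A quick check on -ns- may be worthwhile at this point. 
The lines below are a sketch, not part of the method: 
because (s_id < .) is 0 whenever s_id is missing, 
anyone with no recorded spouse should get ns of 0, 
and everyone else should be counted at least once. 

```stata
* people with no recorded spouse should have ns == 0
assert ns == 0 if missing(s_id)
* everyone with a recorded spouse is counted at least once
assert ns >= 1 if !missing(s_id)
* eyeball the counts household by household
list hhold id s_id ns, sepby(hhold)
```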

2. If any spouse identifier occurs more than 
once, the individuals must be wives with 
the same husband. The person they have 
in common, their husband, is the appropriate 
identifier for a group of husband and wives. 
(No sexism here; this is only a choice 
made to keep the problem tractable.) 
 
. gen g_id = s_id if ns > 1  

3. At this point we are confident of the
status of those groups of wives and also 
of unmarried individuals. We will label 
those OK. (That is, OK = 1 for these
and OK = 0 for others.) 

. gen OK = g_id < . | s_id == . 

4. We are left with 

a. husbands with two or more wives 

b. husbands and wives who are just 
married to each other (monogamous 
couples). 

We'll take them one at a time. 

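5. In the case of a. the appropriate 
group identifier is that of the individual 
concerned (a husband, so his id is already 
in use as a group identifier for his wives, 
from step 2). The same command also gives 
members of monogamous couples their own 
ids for now; step 7 puts that right. 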

. replace g_id = id if g_id == . & s_id < . 

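6. That was the right thing to do wherever 
the identifier was already in use for a 
group created in step 2, and those groups 
have been tagged as -OK-. More generally, 
the right value of -OK- is the largest 
so far assigned for each group identifier: 
a husband joining his wives picks up their 1, 
while members of monogamous couples, each 
so far alone in a group, stay at 0. 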

. bysort hhold g_id (OK) : replace OK = OK[_N] 

7. The only individuals now to be assigned 
are monogamous couples. These are tagged 
by -OK- of 0. A systematic
way to identify them is by assigning 
the minimum of the two ids. That 
is, if person 1 has spouse 2 and 
person 2 has spouse 1, then we 
can give them both group
identifiers of min(1,2) i.e. 1. 
It doesn't matter whether the 
group identifier is that of the 
husband or the wife as none of 
the individuals has any other spouse. 

. replace g_id = min(id,s_id) if OK == 0 

  hhold        id      s_id        ns       g_id     
     23         1         2         1          1     
     23         2         1         1          1     
     23         3         .         0          .     
     23         4         .         0          .     
     45         2         1         1          2     
     45         1         2         2          2     
     45         6         2         2          2     
     45         5         .         0          .     
     45         8         .         0          .     
     45         4         .         0          .     
     45         7         .         0          .     
     45         3         .         0          .     
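
Some checks on the group identifiers are cheap to run. 
What follows is a sketch rather than part of the method. 
It assumes that ids are unique within households and that 
spouse identifiers point at people in the same household; 
-nhead- is a name made up for the illustration. 

```stata
* unmarried individuals should have no group identifier
assert missing(g_id) if missing(s_id)
* anyone married is grouped under their own id or their spouse's
assert inlist(g_id, id, s_id) if !missing(s_id)
* each group should contain exactly one person whose id names it
bysort hhold g_id : egen nhead = total(id == g_id) if !missing(g_id)
assert nhead == 1 if !missing(g_id)
```

If any -assert- fails, the data break one of those 
assumptions and the groups deserve a closer look. 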

So, we have now identified groups within households, 
of one husband and one or more wives. 
Although the example contains no households with 
more than one husband, the method should apply 
to those as well. 

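To get mean ages of spouses, we need to 
tell the two sides of each group apart. 
The person whose id is the group identifier 
will serve: the husband in a group with 
several wives, and the partner with the 
smaller id in a monogamous couple. (-ns- 
alone will not do, as both members of a 
monogamous couple have -ns- of 1.) 

gen byte head = id == g_id 
bysort hhold g_id head : egen meanage = mean(age) if ns 
by hhold g_id : gen S_age = cond(head, meanage[1], meanage[_N]) if ns 

Within each group the head sorts last, so 
meanage[_N] is the head's age and meanage[1] 
is the mean age of the others. 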
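
The result can be set beside the spouse ages supplied 
in the sample data. This is a sketch only. For wives 
sharing a husband the two should agree exactly. For 
a husband with several wives they need not, as the 
supplied s_age is that of one wife while S_age is 
the mean over all of them. 

```stata
* wives sharing a husband: computed and supplied spouse ages should agree
assert S_age == s_age if ns > 1
* side-by-side listing for everything else
list hhold id s_id age s_age S_age, sepby(hhold)
```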


Nick 