> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Hans J. Baumgartner
> Sent: 24 March 2004 16:09
> To: statalist
> Subject: st: [Fwd: gen a between houshold id]
>
>
> Dear Statalist,
>
> Can anybody help me generate a new variable that indicates the
> household of the parents? To clarify, my data look like this:
>
> persnr  sex  V05001  vnr  v_hh
> 1       m    1       .    .
> 2       f    1       .    .
> 3       f    1       1    1
> 4       f    11      1    1
>
> persnr is a person identifier (1 = the father, 2 = the mother,
> 3 = the daughter that lives with the parents and 4 = the daughter
> that lives in another household).
> vnr identifies the persnr of the father.
>
> I would like to generate a new variable v_hh that identifies the
> household id (V05001) of the father. It is easy for the daughter that
> lives with her parents but I find it quite difficult to generate the
> between household link with the daughter that does not live with her
> parents.
>
> I thus constructed a loop that is copy/pasted below. However, I have
> 21,000 obs. and I have to run the loop twice, i.e. once for each
> parent, which takes more than 3 hours.
>
> Can anybody help me to generate between-household links faster?
>
> And to make things even more complicated: I have more than two
> generations in my data, that is, motherid and fatherid are not
> necessarily missing.
>
> I appreciate all comments.
>
> Thanks
> Hans
>
>
> ==============================================================
>
> sort persnr
> gen id=_n                  /* running number for the loop */
> sum id
> local end = r(max)         /* end of the loop */
>
> etime, start
>
> /* Father */
> gen v_hh1 =.
> gen v_hh2 =.
> gen v_kein=0
> forvalues x = 1/`end' {
>   if vnr[`x']>0 {
>     /* write vnr[i] for all obs. */
>     gen temp_i = vnr[`x'] if vnr[`x']>0
>     /* dummy for father */
>     gen temp_d = (temp_i==persnr)
>     /* identifier for father HH */
>     gen temp_hh= V05001 if temp_d==1
>     egen count=count(temp_hh)
>     /* error if >1 fathers */
>     error count>1
>     /* dummy if no father found */
>     replace v_kein=1 in `x' if count==0
>     /* writes father HH for all obs */
>     egen temp_pic= max(temp_hh)
>     /* picks father HH (STM) */
>     replace v_hh1 = temp_pic in `x' if HV[`x']==1 | partner[`x']==1
>     /* picks father HH (else) */
>     replace v_hh2 = temp_pic in `x' if HV[`x']==0 | partner[`x']==0
>     drop temp_* count
>   }
> }
> etime
Presumably you have a variable in the data set that identifies families?
(I.e. groups of persnr 1/2/3/4 that come from the same family, though
not necessarily living in same household.) Variable "vnr" does not do
this (it is missing for a father apparently).
Assuming that you have this id variable (call it "fam_id"), then your
task should be straightforward (I think). The general principle is to
think "by group" operations, rather than loops.
e.g. -bysort fam_id: ....-
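A minimal sketch of what such a by-group approach might look like, using the four example persons from the question. The variable "fam_id" and the helper variables father_id, father_hh and fam_father_hh are hypothetical names, not in the original data, and the sketch assumes exactly one father per family (two generations only):

```stata
* Hypothetical example data mirroring the question; fam_id is assumed
clear
input persnr str1 sex V05001 vnr fam_id
1 "m" 1  . 1
2 "f" 1  . 1
3 "f" 1  1 1
4 "f" 11 1 1
end

* Spread the father's persnr to every member of the family
* (egen's max() ignores missing values)
bysort fam_id: egen father_id = max(vnr)

* Household id of the father, non-missing only on the father's own row
gen father_hh = V05001 if persnr == father_id

* Spread the father's household id to every member of the family
bysort fam_id: egen fam_father_hh = max(father_hh)

* Keep it only for persons who actually have a father recorded
gen v_hh = fam_father_hh if vnr < .

list persnr V05001 vnr v_hh
```

This replaces the observation-by-observation loop with two passes over the data. With more than two generations a single family group can contain several fathers, so the grouping would need to be defined more carefully than in this sketch.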
Stephen
-------------------------------------------------------------
Professor Stephen P. Jenkins <[email protected]>
Institute for Social and Economic Research
University of Essex, Colchester CO4 3SQ, U.K.
Tel: +44 1206 873374. Fax: +44 1206 873151.
http://www.iser.essex.ac.uk