Hello,
I am having trouble figuring out how to create some
peer variables for my data. I have a dataset of kids
in households and I need to calculate variables that
reflect the position of a kid (say, aged 6-17) among
other kids (in the same age group) in the residential
neighbourhood.
If I wanted to obtain simple peer variables, then the
solution would be to calculate the -peer variable- of
interest for each kid (excluding him/her from the
calculation), following the rule explained in the FAQ
section on the Stata webpage:
http://stata.com/support/faqs/data/members.html
With a little modification of the rule, own siblings
can also be netted out of the above calculation of
peer vars.
The problem is in creating variables like, say,
"relative birth order" (RBO) for every kid in the
sample using only data on -all other kids in the
neighbourhood-, where we need to: (a) exclude own
siblings from the calculation & (b) retain the
individual kid (for whom the variable is being
calculated) in the calculation.
For kids with no siblings, there is little problem. It
is the presence of own siblings that I am finding
difficult to deal with. In contrast to common peer
group variables, which are constant for kids from the
same family, this (within-neighbourhood) RBO may
differ between siblings of different ages within a
family. Some form of nested loop (maybe as an
extension of the cited FAQ) seems necessary, but I
haven't been able to construct the general form of it.
Once that is done, I can calculate lots of other
variables (e.g. what % of kids in the peer group are
older, of the same age, etc.). Any help will be
appreciated. I elaborate on my case below.
Thanks. Asad.
RBO (relative birth order) is calculated as:
g RBO=(BO-1)/(Tot_KIDS-1)
where, "BO" (absolute birth order) is:
egen BO=rank(age*(include)),by(Nid) field
But the presence of own siblings means that both BO
and Tot_KIDS (total no. of other kids in the
neighbourhood) are incorrect.
For a correct Tot_KIDS, we could follow the Stata FAQ
(by Nick) ...
g Tot_KIDS =.
qui sum hid
qui forvalues i = 1 / `r(max)' {
g include=1 if hid!=`i' & age>=6 & age<18
egen Tot_KIDS_1=sum((age>=6 & age<18)*include),by(Nid)
egen Tot_KIDS_2=sum((age>=6 & age<18)*include),by(family)
replace Tot_KIDS = Tot_KIDS_1- Tot_KIDS_2 if hid==`i'
drop include Tot_KIDS_*
}
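A loop-free sketch of the same count may also be possible. It assumes that Tot_KIDS is meant to be the number of kids aged 6-17 in the neighbourhood who are not own siblings, plus the kid itself, and that family ids are unique within a neighbourhood. The names elig, n_nbhd, n_fam and n_peers are made up for illustration; on older Stata versions the egen function total() is spelled sum().

```stata
* Assumes a dataset in memory with Nid, family and age.
* Flag kids in the age group of interest (6-17).
gen byte elig = age >= 6 & age < 18

* Eligible kids per neighbourhood, and per family within neighbourhood.
egen n_nbhd = total(elig), by(Nid)
egen n_fam  = total(elig), by(Nid family)

* Eligible kids from other families, plus the kid itself.
gen n_peers = n_nbhd - n_fam + 1 if elig
```

For eligible kids without eligible siblings n_fam is 1, so n_peers is simply the neighbourhood count.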
For BO, however, I need a way to exclude own siblings
while executing the following for each kid in the
neighbourhood:
egen BO=rank(age),by(Nid) field
Then, I guess, my problem is solved. Below is some
sample data, where BO is the birth order (within
neighbourhood) variable to be calculated for each kid.
"Nid" is the neighbourhood id, "family" is the family
id, etc.
Nid  family  person  age  BO
1    1       1       40   .
1    1       2       35   .
1    1       3       12   2
1    1       4        6   1
1    1       5        7   1
1    2       1       36   .
1    2       2       31   .
1    2       3        8   3
1    3       1       40   .
1    3       2       12   4
1    3       3       12   4
1    3       4        8   3
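
One way the BO column above might be computed is sketched here, looping over observations rather than households. The BO values in the sample appear to follow a dense ascending rank (1 plus the number of distinct ages, among eligible kids from other families in the same neighbourhood, that are below the kid's own age), which is not what -egen, rank() field- returns, so the sketch assumes that convention. The names kid, tagged and BO_nosib are made up for illustration, and family ids are assumed to be unique within a neighbourhood.

```stata
clear
input Nid family person age
1 1 1 40
1 1 2 35
1 1 3 12
1 1 4 6
1 1 5 7
1 2 1 36
1 2 2 31
1 2 3 8
1 3 1 40
1 3 2 12
1 3 3 12
1 3 4 8
end

* Flag kids in the age group of interest (6-17).
gen byte kid = age >= 6 & age < 18
gen BO_nosib = .

quietly forvalues i = 1/`=_N' {
    if kid[`i'] {
        * Tag one observation per distinct age among eligible kids
        * in the same neighbourhood who belong to other families.
        egen byte tagged = tag(age) if kid & Nid == Nid[`i'] & family != family[`i']
        * Distinct peer ages below kid i's own age, plus 1 for kid i.
        count if tagged & age < age[`i']
        replace BO_nosib = r(N) + 1 in `i'
        drop tagged
    }
}

list Nid family person age BO_nosib, sepby(family)
```

Calling egen once per observation will be slow on a large dataset, so this is meant only to pin down the rule. Other tie conventions can be had by changing the count condition, e.g. dropping the tag to count all younger peers rather than distinct ages.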