Statalist The Stata Listserver



st: RE: RE: Friends' characteristics


From   "Carter Rees" <[email protected]>
To   <[email protected]>
Subject   st: RE: RE: Friends' characteristics
Date   Thu, 31 Aug 2006 00:37:40 -0400

Chris,

Here is a bit of code (with variable names changed from the original) that
Maarten Buis gave me earlier and that Nick Cox later commented on.  My
question was essentially the same as yours, and a small modification of
this code helped immensely.  The final format will let you calculate a mean
across your gpa variables.

The original conversation can be found here:  
http://www.stata.com/statalist/archive/2006-03/msg00612.html

So, Nick is correct in that there is a -merge- solution.

* note:  this will create the gpa variable associated with each nominated
* friend
drop _all
tempfile a
input frnd gpa
      99   2.5
      88   3.1
      77   4
      66   1.8
      55   3.6
      44   2.9
end
sort frnd
save test, replace

drop _all
input aid frnd1 frnd2 frnd3 frnd4
      99   66    77   .     .    
      88   77    99   .     .    
      77   55    44   99    .
      66   88    99   44   77
      55   44    .    .    .
      44   66    .    .    .
end


reshape long frnd, i(aid)
drop if frnd ==.
sort frnd
* old-style match merge; in Stata 11 and later: merge m:1 frnd using test
merge frnd using test
drop if _merge == 2
drop _merge
reshape wide frnd gpa, i(aid) j(_j)
list
save test2, replace
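
As a rough sketch of the last step (the mean across the gpa variables), something like the following should work on the wide layout.  The three rows are typed in by hand only so that the example runs on its own; they are what the code above should produce for aid 99, 77 and 44.  The name gpa_f is borrowed from Nick's post.

```stata
* Sketch only: a few rows in the layout left by -reshape wide-
* (aid, then frnd1 gpa1 frnd2 gpa2 ...), entered by hand for illustration.
* With the real data, start from -use test2, clear- instead.
clear
input aid frnd1 gpa1 frnd2 gpa2 frnd3 gpa3 frnd4 gpa4
      99   66   1.8   77   4     .    .     .    .
      77   55   3.6   44   2.9   99   2.5   .    .
      44   66   1.8   .    .     .    .     .    .
end

* gpa? matches gpa1-gpa4 only.  A range such as gpa1-gpa4 would also pick
* up the frnd variables that sit between them in this variable order.
* rowmean() ignores missing values, so absent friends do not count.
egen gpa_f = rowmean(gpa?)
list aid gpa? gpa_f
```

The wildcard is used rather than a variable range because -reshape wide- interleaves the frnd and gpa variables.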

Carter Rees
School of Criminal Justice
University at Albany, SUNY


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Wednesday, August 30, 2006 7:12 PM
To: [email protected]
Subject: st: RE: Friends' characteristics

There is probably a -merge- solution. 

In this case, at worst, a solution is a single 
loop over observations. 

gen gpa_f = .

qui forval i = 1/`=_N' {
	* the -su- command is continued onto a second line with ///
	su gpa if inlist(id, `=friend1[`i']', `=friend2[`i']', ///
		`=friend3[`i']', `=friend4[`i']'), meanonly
	replace gpa_f = r(mean) in `i'
}

If your ids are string, then you need instead 

inlist(id, "`=friend1[`i']'", "`=friend2[`i']'", ///
	"`=friend3[`i']'", "`=friend4[`i']'")

Nick 
[email protected] 

Chris Ruebeck
 
> Suppose my data set has these 6 variables,
> 
> 	id : this respondent's ID,
> 	gpa : this respondent's GPA, and
> 	friend1-4 : the IDs (possibly missing) of this respondent's friends.
> 
> I would like to create four new variables that record the GPA of each
> respondent's friends, and then take their average.  I have many
> observations and want to avoid slower methods.  Here is my code for
> the first friend.
> 
> gen gpaf1 = .
> egen group = group(friend1)
> summarize group, meanonly
> forvalues num = 1/`r(max)' {
> 	summarize friend1 if group==`num', meanonly
> 	local idf = r(mean)
> 	summarize gpa if id==`idf', meanonly
> 	replace gpaf1 = r(mean) if group==`num'
> }
> 
> I figure I can nest this in a forvalues loop from 1-4, and then use
> -egen ... rowmean(gpaf1-4)- to get the mean over friends.  In the code
> above, levelsof could replace the -egen ... group(friend1)- but macro
> length limits would require splitting the friends' ids into two to
> four groups.
> 
> Is there a faster method, perhaps with Mata?
> 
> (An additional wrinkle: some friends may no longer be in the
> database---so an observation's friend1, for example, may contain a
> number that is not the id of any observation.  I think the code above
> is robust to that problem, but perhaps this is another potential
> speed improvement.)

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



