st: Re: functions of different subsets of data?
Your message is quite confusing -- it's not really clear what you want to
do, most of the Stata commands you show are not valid syntax and you
reference values of agegroup that do not appear in the data you show. I
think you are trying to share your frustration with the list? Based on my
best guess, you want the means of certain variables for certain observations
and then want to perform a calculation with them. This task is probably
best done using local macros or scalars.
gen byte mysubset= agegroup >=1825 & agegroup <=10950
sum y if mysubset==1
local ymean1=r(mean)
sum y if mysubset==0
local ymean0=r(mean)
sum x if mysubset==1
local xmean1=r(mean)
sum x if mysubset==0
local xmean0=r(mean)
local slope=(`ymean1'-`ymean0') / (`xmean1'-`xmean0')
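If you want the two age bands from the original question explicitly (rather than one band versus everything else), the same idea works with scalars. This is only a minimal sketch: the toy data, the grouping variable grp and the scalar names are made up for illustration.

```stata
* Toy data (hypothetical values, for illustration only)
clear
input agegroup x y
1825  1  2
3650  2  4
7300  3  6
14600 5 10
18250 6 12
21900 7 14
end

* Band 1: 1825 <= agegroup <= 10950; band 2: 10950 < agegroup <= 21900
* (observations outside both bands keep grp missing and are ignored)
gen byte grp = 1 if agegroup >= 1825 & agegroup <= 10950
replace grp = 2 if agegroup > 10950 & agegroup <= 21900

* Store the mean of y and x within each band in scalars
forvalues g = 1/2 {
    quietly summarize y if grp == `g', meanonly
    scalar ymean`g' = r(mean)
    quietly summarize x if grp == `g', meanonly
    scalar xmean`g' = r(mean)
}

* Slope between the two band means
scalar slope = (ymean2 - ymean1) / (xmean2 - xmean1)
display "slope = " scalar(slope)
```

Scalars keep full double precision and, unlike generated variables, hold a single value rather than one per observation, which is the "constant" the original question was after.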
You can wrap the calculation in a short ado program so that you can reuse it
in other situations:
program define myslope, rclass
    version 9.2
    * two variables (y first, then x); subset() holds a true/false expression
    syntax varlist(min=2 max=2) [if], subset(str)
    tempvar subs
    gen byte `subs'=`subset'
    marksample touse
    * store the mean of each variable within each subset in a local macro
    foreach var of local varlist {
        qui sum `var' if `touse' & `subs'==0
        local `var'mean0=r(mean)
        qui sum `var' if `touse' & `subs'==1
        local `var'mean1=r(mean)
    }
    local y : word 1 of `varlist'
    local x : word 2 of `varlist'
    * slope = difference in y means / difference in x means
    local b=(``y'mean1'-``y'mean0') / (``x'mean1'-``x'mean0')
    noi di "slope = `b'"
    return local slope=`b'
end
Here's a sample call (the first variable is treated as y, the second as x):
myslope y x, subset(agegroup >=1825 & agegroup <=10950)
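Because the program is rclass, the result can be picked up afterwards from r(slope). A rough sketch, assuming myslope has already been defined in the current session as above; the toy data are made up for illustration:

```stata
* Assumes the myslope program above is already defined in this session.
* Toy data (hypothetical values, for illustration only)
clear
input agegroup x y
1825  1  2
3650  2  4
7300  3  6
14600 5 10
18250 6 12
21900 7 14
end

* The program reads the first variable as y and the second as x
myslope y x, subset(agegroup >= 1825 & agegroup <= 10950)

* r(slope) is returned as a macro, so expand it before reusing it
local b = `r(slope)'
display "change in y for a 3-unit change in x: " 3*`b'
```

Copy r(slope) into a local or scalar right away, since the next r-class command (another summarize, for instance) will overwrite it.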
Michael Blasnik
----- Original Message -----
From: "Gustaf Rydevik" <[email protected]>
To: <[email protected]>
Sent: Wednesday, December 06, 2006 9:34 AM
Subject: st: functions of different subsets of data?
Hi all,
I'm fairly new to Stata, and still get frustrated when trying to
figure out certain simple things. The major issue is the lack of
ability to create constants, and being forced to only use variables.
I'm probably missing something, but that's what it seems like to me.
Right now, I'm trying to do a primitive regression-type calculation.
I have two columns, Xi and Yi, and would like to calculate:
1) The means for the subsets that belong to
10950(30yrs)>=agegroup>=1825 (5yrs), and 21900>=agegroup>10950,
separately and for both columns.
2) Then calculate the slope (Yi_mean2-Yi_mean1)/(Xi_mean2-Xi_mean1)
The best thing I've come up with so far is to generate temporary
variables with NA's where not defined using statements like:
gen Xi_mean1 = (sum) Xi if agegroup >=1825 & agegroup <=10950
replace Xi_mean1=Xi_mean1/6
and then calculating 2) by referring to a specific entry in the columns
of the temporary variables with subsetting. However, because I can't
know which row number contains a real value (everything is taking place
inside a .do file), and since I'm dealing with constants, this
roundabout way bothers me.
Is there a more "natural" way to do things?