Statalist The Stata Listserver



st: Re: functions of different subsets of data?


From   "Michael Blasnik" <[email protected]>
To   <[email protected]>
Subject   st: Re: functions of different subsets of data?
Date   Wed, 06 Dec 2006 10:10:57 -0500

Your message is quite confusing. It's not really clear what you want to do, most of the Stata commands you show are not valid syntax, and you reference values of agegroup that do not appear in the data you show. I think you are trying to share your frustration with the list? Based on my best guess, you want the means of certain variables for certain subsets of observations and then want to perform a calculation with those means. This task is probably best done using local macros or scalars, which hold single values without creating new variables:

gen byte mysubset= agegroup >=1825 & agegroup <=10950
sum y if mysubset==1
local ymean1=r(mean)
sum y if mysubset==0
local ymean0=r(mean)
sum x if mysubset==1
local xmean1=r(mean)
sum x if mysubset==0
local xmean0=r(mean)
local slope=(`ymean1'-`ymean0') / (`xmean1'-`xmean0')
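
The same calculation can also be sketched with scalars, the other option mentioned above. This is only an illustration: the six observations are made-up toy data, and the scalar names (ymean1, slope, and so on) are arbitrary choices, not anything from the original post. inrange() is used in place of the mysubset indicator so the example runs on its own.

```stata
* Toy data, invented purely for illustration
clear
input agegroup x y
1825  1  2
3650  2  4
10950 3  6
14600 5 12
18250 6 14
21900 7 16
end

* Means of y and x inside and outside the agegroup range, stored as scalars
quietly summarize y if inrange(agegroup, 1825, 10950)
scalar ymean1 = r(mean)
quietly summarize y if !inrange(agegroup, 1825, 10950)
scalar ymean0 = r(mean)
quietly summarize x if inrange(agegroup, 1825, 10950)
scalar xmean1 = r(mean)
quietly summarize x if !inrange(agegroup, 1825, 10950)
scalar xmean0 = r(mean)

* Difference in y means divided by difference in x means
scalar slope = (ymean1 - ymean0) / (xmean1 - xmean0)
display "slope = " scalar(slope)
```

Scalars share a namespace with variables, and Stata resolves a bare name to a variable first, so wrapping the name in scalar() as in the last line is the safer habit when a variable of the same name might exist.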

You can turn the calculation into a short ado program so that you can reuse it in other situations. Note that it expects the y variable first and the x variable second:

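program define myslope, rclass
    version 9.2
    * expects two variables (y first, then x) and a subset() expression
    syntax varlist(min=2 max=2) [if], subset(str)
    * indicator: 1 where the subset() expression is true, 0 otherwise
    tempvar subs
    gen byte `subs'=`subset'
    * touse marks observations that satisfy [if] and are nonmissing on both variables
    marksample touse
    * store each variable's mean outside (0) and inside (1) the subset
    foreach var of local varlist {
        qui sum `var' if `touse' & `subs'==0
        local `var'mean0=r(mean)
        qui sum `var' if `touse' & `subs'==1
        local `var'mean1=r(mean)
    }
    local y: word 1 of `varlist'
    local x : word 2 of `varlist'
    * difference in y means divided by difference in x means
    local b=(``y'mean1'-``y'mean0') / (``x'mean1'-``x'mean0')
    noi di "slope = `b'"
    return local slope=`b'
end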

Here's a sample syntax:

myslope y x, subset(agegroup >=1825 & agegroup <=10950)
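
Because the program is declared rclass, the result can also be picked up afterwards from r(slope). A minimal sketch, assuming myslope has already been defined in the current session; auto.dta ships with Stata, and price, weight and foreign are only stand-ins for y, x and the subset condition:

```stata
* Assumes the myslope program shown above has been run or saved as myslope.ado
sysuse auto, clear
myslope price weight, subset(foreign == 1)

* Copy the returned value right away; the next r-class command overwrites r()
local b "`r(slope)'"
display "stored slope: `b'"
```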

Michael Blasnik



----- Original Message ----- From: "Gustaf Rydevik" <[email protected]>
To: <[email protected]>
Sent: Wednesday, December 06, 2006 9:34 AM
Subject: st: functions of different subsets of data?



Hi all,

I'm fairly new to Stata, and still get frustrated when trying to
figure out certain simple things. The major issue is the lack of
ability to create constants, and being forced to only use variables.
I'm probably missing something, but that's what it seems like to me.

Right now, I'm trying to do a primitive regression-type calculation.

I have two columns, Xi and Yi, and would like to calculate:

1) The means for the subsets that belong to
10950(30yrs)>=agegroup>=1825 (5yrs), and 21900>=agegroup>10950,
separately and for both columns.

2) Then calculate the slope (Yi_mean2-Yi_mean1)/(Xi_mean2-Xi_mean1)

The best thing I've come up with so far is to generate temporary
variables with NA's where not defined using statements like:

gen Xi_mean1 = (sum) Xi if agegroup >=1825 & agegroup <=10950
replace Xi_mean1=Xi_mean1/6

and then calculating 2) by referring to a specific entry in the columns
of the temporary variables with subsetting. However, because I can't
know which row number contains a real value (everything is taking place
inside a .do file), and since I'm dealing with constants, this
roundabout way bothers me.

Is there a more "natural" way to do things?