--- On Fri, 5/6/09, Havedd Wadf <[email protected]> wrote:
> The data set has more than 3000 lines, each of which
> has three parts:
> X1,...,Xn, Y1,...,Yn, Z1,...,Zt
> First, for each line, I do a simple regression:
> Y=a+b*X
> Second, for each line, I use the Z's and the
> estimated a and b to do some calculations.
Looks like a problem for -reshape- and -statsby-. Below is
an example with just two observations, but it should also
work for 3000 observations.
*-------------- begin example ---------------
// create some example data
drop _all
input x1 x2 x3 y1 y2 y3 z1 z2 z3
1 2 3 4 3 5 7 8 9
3 1 2 4 5 6 6 7 8
end
list
// -reshape- needs an id variable
// here create such an id which is 1 for the first
// observation, 2 for the second, etc.
gen caseid = _n
// reshape the dataset into long format
reshape long x y z, i(caseid) j(sortid)
// store it in a temporary dataset
// so that we can merge the regression coefficients
// in at a later time
tempfile tofill
sort caseid sortid
save `tofill'
// create a dataset with the regression coefficients
// for each observation (caseid)
statsby, by(caseid) clear : regress y x
list
// merge these regression coefficients back into the
// reshaped dataset
sort caseid
merge caseid using `tofill'
// note: the line above uses the old -merge- syntax; in
// Stata 11 or later the same merge can be written as
// merge 1:m caseid using `tofill'
assert _merge == 3
drop _merge
sort caseid sortid
list
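// a sketch of the second step (calculations with the z's
// and the estimated a and b). The question does not say
// which calculation is needed, so the formula and the
// variable names calc and meancalc below are only
// placeholders, not the actual calculation.
// By default -statsby- stores the coefficients as
// _b_cons (a) and _b_x (b)
gen double calc = _b_cons + _b_x*z
// summarize the placeholder calculation per original line,
// here with the mean over the z's of that line
bysort caseid (sortid): egen double meancalc = mean(calc)
list caseid sortid z _b_cons _b_x calc meancalc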
*------------------ end example -----------------
Hope this helps,
Maarten
-----------------------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://home.fsw.vu.nl/m.buis/
-----------------------------------------