
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one line.


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   RE: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one line.
Date   Mon, 28 Jun 2010 20:30:16 +0200

NJC's solution is much better than mine. Mine leaves behind residuals for
all observations in your dataset, even those that never entered the
estimation sample. Their meaning is hence dubious. So stick to Nick's code,
I would say...


HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Dani Tilley
Sent: Montag, 28. Juni 2010 19:51
To: [email protected]
Subject: Re: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one
line.

Thanks for your response. I also think they should match, and the number of
observations reported should be the number used in the regression, not
the total number of observations. The MW snippet is missing a condition in the
"predict res`lev', res" line. If you compare the residuals from these two,
you'll notice the discrepancy.

///MW
sysuse auto, clear
qui levelsof rep78

foreach lev in `r(levels)'{
    qui regress price weight length if rep78==`lev'
    predict res`lev', res
}


///NJC
sysuse auto, clear
qui levelsof rep78
gen residual = .

foreach lev in `r(levels)'{
   tempvar foo
   qui regress price weight length if rep78==`lev'
   predict `foo', res
   replace residual = `foo' if rep78 == `lev'
   drop `foo'
}

If, in MW, we say "predict res`lev' if rep78==`lev', res", the problem is
fixed. This is all I meant.
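
A closely related variant of this fix, sketched here as an illustration rather than as anything posted in the thread, restricts -predict- with e(sample) instead of repeating the -if- condition. e(sample) marks the observations actually used by the most recent estimation command, so it also excludes any observation dropped because of missing values in price, weight or length. The local macro name "levels" is an arbitrary choice.

```stata
sysuse auto, clear
quietly levelsof rep78, local(levels)

foreach lev of local levels {
    quietly regress price weight length if rep78 == `lev'
    * e(sample) is 1 only for observations used in the regression just fitted
    predict res`lev' if e(sample), residuals
}

* Each res# variable is now nonmissing only within its own estimation sample
summarize res*
```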

Thanks a lot to Martin and you for the help.


Best,
DF Tilley

----- Original Message ----
From: Nick Cox <[email protected]>
To: [email protected]
Sent: Mon, June 28, 2010 1:35:54 PM
Subject: RE: AW: st: RE: AW: RE: AW: Regressing and storing residuals in one
line.

I can answer one of these questions; otherwise I am not clear what you
are puzzled about as I can't see any problem with the code suggested. 

The number of observations for the composite residuals variable should
be the sum of the numbers of observations included in the separate
regressions. If any observation was excluded from a regression, the
corresponding residual should be missing. 

That would be a consequence of your data, which we can't see. 

Minima and maxima should match, as I understand it. 
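
One way to check this directly is to build both sets of residuals from the same regressions and compare them. The following is only a sketch on the auto data; the names "combined", "resvars" and "levels" are made up for the illustration, and the per-level -predict- carries the -if- condition discussed elsewhere in this thread.

```stata
sysuse auto, clear
quietly levelsof rep78, local(levels)
generate residual = .
local resvars

foreach lev of local levels {
    tempvar foo
    quietly regress price weight length if rep78 == `lev'
    * one variable per level, restricted to that level
    quietly predict res`lev' if rep78 == `lev', residuals
    local resvars `resvars' res`lev'
    * one composite variable, filled level by level
    quietly predict `foo', residuals
    quietly replace residual = `foo' if rep78 == `lev'
    drop `foo'
}

* Collapse the per-level variables into one; rows with no residual stay missing
egen combined = rowtotal(`resvars'), missing
assert combined == residual

* The two counts should agree when no regressor is missing within a level
count if !missing(residual)
count if !missing(rep78)
```

If the -assert- passes silently, the two approaches agree observation by observation, so their minima, maxima and total counts agree as well.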

Nick 
[email protected] 

Dani Tilley

Sorry, I completely missed that. 

I also tried a loop structurally similar to the one you suggested, but
noticed that the summarize res* output differs from the summarize
residual output from NJC's suggestion. I understand that your loop
stores the residuals in separate variables (one for each category),
while NJC creates an empty variable and populates it on the fly, but
shouldn't, say, the minimum or maximum residuals from the two outputs
match? Shouldn't the smallest value in the Min column of the summarize
res* (MW) output be the same as the Min from summarize residual (NJC)?
In addition, shouldn't the sum of the Obs column from the summarize res*
(MW) output be _N?

I'm very new to Stata, so I don't really know if this makes sense at all,
but I think this is the correct way to get the residuals using the loop
you suggested:

predict res`lev' if country == `lev', res

From: Martin Weiss <[email protected]>

Having -drop-ped it, you cannot access it anymore. But NJC's strategy is
that the results you are interested in are gathered inside the permanent
"residual" variable, so this is not a drawback.

Dani Tilley

If I define a tempvar and drop it at the end of the loop, can I still
refer to it elsewhere in the program (i.e. outside the loop)?

From: Nick Cox <[email protected]>

If you are doing this lots of times for real, you could end up with
storage problems with dozens of temporary variables. If that doesn't
bite, then OK. 
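
A possible middle course, given here only as a sketch and not as something proposed in the thread, is to declare the temporary name once before the loop and -drop- the variable at the end of each pass, so that at most one temporary variable exists at any time. The local macro name "levels" is an arbitrary choice.

```stata
sysuse auto, clear
quietly levelsof rep78, local(levels)
generate residual = .

* One temporary name, declared once and reused on every pass
tempvar foo

foreach lev of local levels {
    quietly regress price weight length if rep78 == `lev'
    quietly predict `foo', residuals
    quietly replace residual = `foo' if rep78 == `lev'
    * Dropping here frees the name for the next -predict-
    drop `foo'
}

summarize residual
```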

Martin Weiss

The 

*************
drop `foo'
*************

line could be safely omitted, btw. Stata just makes up new tempnames, and
discards them all at the conclusion of the do-file.

Nick Cox

Such residuals have rather poorly defined properties, but let's set that
on one side. 

A single variable can be obtained through a minor variation on Martin's
recipe: 

sysuse auto, clear
qui levelsof rep78
gen residual = . 

foreach lev in `r(levels)'{
    tempvar foo 
    qui regress price weight length if rep78==`lev'
    predict `foo', res
    replace residual = `foo' if rep78 == `lev'
    drop `foo' 
}

Nick 
[email protected] 

Martin Weiss

Just loop through the thing:

*************
sysuse auto, clear
qui levelsof rep78

foreach lev in `r(levels)'{
    qui regress price weight length if rep78==`lev'
    predict res`lev', res
}
*************

Dani Tilley

I'm trying to run several regressions (one for each level of a categorical
variable) and store the residuals from each regression in a local macro or
new variable I could later manipulate. I figured I could use:

bysort category: regress y x1 x2

to run the regressions, but I need a second line of code (predict name,
residuals) to get the residuals when bysort allows only one. Is there a way
around this?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



      

