Re: Re-re-post: Stata 11 - Factor variables in a regression command
From: Michael Norman Mitchell <[email protected]>
To: [email protected]
Subject: Re: Re-re-post: Stata 11 - Factor variables in a regression command
Date: Fri, 30 Apr 2010 23:42:50 -0700
Dear Ricardo
The command
. logistic y a#b
includes just the "a by b" interaction; it includes neither the main
effect of a nor the main effect of b. By contrast, the command
. logistic y a##b
includes the main effect of a, the main effect of b, and the a by b interaction. It is equivalent to typing
. logistic y a#b a b
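As a quick check, here is a minimal sketch using simulated data (the names y, a, and b follow your posting; the data themselves are invented purely for illustration). Both specifications report the same log likelihood:

    * simulated 0/1 predictors and outcome, for illustration only
    clear
    set seed 2010
    set obs 5000
    generate byte a = runiform() < .5
    generate byte b = runiform() < .5
    generate byte y = runiform() < invlogit(-2 + .3*a + .4*b - .5*a*b)

    * full factorial: main effects of a and b plus the a#b interaction
    logistic y a##b
    display e(ll)

    * the same model written out term by term
    * (Stata may flag some terms as collinear and omit them, but the fit is identical)
    logistic y a#b a b
    display e(ll)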
As John Fox describes in his regression book, a properly formed
regression model that contains an interaction should also include all of
the lower-order main effects. In other words, when including a#b, you
should also include a and b. There are instances where one could omit the
main effects, but only if you know exactly why you are doing so and
understand the ramifications for the interpretation of the terms in
the model.
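To make concrete what -logistic y a#b- by itself is fitting: with the (a=0, b=0) cell as the reference, it enters one indicator for each remaining cell of the a-by-b crossing. Here is a hedged sketch, continuing the simulated data above, that rebuilds those indicators by hand (the cell* names are only illustrative):

    * one indicator per non-reference cell of the a-by-b crossing
    generate byte cell01 = (a == 0) & (b == 1)
    generate byte cell10 = (a == 1) & (b == 0)
    generate byte cell11 = (a == 1) & (b == 1)

    logistic y cell01 cell10 cell11    // same fit and odds ratios as -logistic y a#b-
    logistic y a#b

In your output below the two parameterizations line up the same way: the (0 1) and (1 0) odds ratios in the first regression match 1.b and 1.a in the second, and the (1 1) odds ratio in the first is the product of the three odds ratios in the second (1.447424 x 1.567419 x 0.5342167, which is about 1.212).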
I hope that is helpful.
Michael N. Mitchell
See the Stata tidbit of the week at...
http://www.MichaelNormanMitchell.com
On 2010-04-30 10.48 PM, Ricardo Basurto wrote:
Not the best way to start posting to Statalist, is it? I am
re-arranging my message hoping that at least that way my question
won't be cut out. (If anyone has suggestions on how to successfully
submit messages from within Gmail, I would appreciate those as well.)
--------------------------------------------------------------------------------------------------------------------------------------------------------
I am having trouble understanding the difference between a regression
that uses a cross operator (#) and one that uses a cross factorial
operator (##).
For example, below is the output I get from running two different
regressions. From the log likelihood, the LR chi2, and so on, it seems
clear to me that both commands are fitting the same regression model.
Also, I can reproduce the second regression by fitting a regression with
dummies for a=1, b=1, and a variable equal to the product of those two
dummies; however, I just can't figure out what exact model is being
fitted in the first regression. Can anyone explain this?
Thank you,
Ricardo
REGRESSION #1:

. logistic y a#b

Logistic regression                               Number of obs   =      19670
                                                  LR chi2(3)      =       7.71
                                                  Prob > chi2     =     0.0525
Log likelihood = -1473.1898                       Pseudo R2       =     0.0026

------------------------------------------------------------------------------
           y | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         a#b |
        0 1  |   1.567419   .2804138     2.51   0.012       1.1038      2.2256
        1 0  |   1.447424   .2588797     2.07   0.039       1.0194      2.0551
        1 1  |   1.211988   .2246236     1.04   0.300       .84283      1.7428
------------------------------------------------------------------------------
REGRESSION #2:

. logistic y a##b

Logistic regression                               Number of obs   =      19670
                                                  LR chi2(3)      =       7.71
                                                  Prob > chi2     =     0.0525
Log likelihood = -1473.1898                       Pseudo R2       =     0.0026

------------------------------------------------------------------------------
           y | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         1.a |   1.447424   .2588797     2.07   0.039       1.0194      2.0551
         1.b |   1.567419   .2804138     2.51   0.012       1.1038      2.2256
             |
         a#b |
        1 1  |   .5342167   .1302597    -2.57   0.010       .33125      .86152
------------------------------------------------------------------------------
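
For reference, the by-hand reproduction of the second regression described above can be sketched like this (a1, b1, and a1b1 are illustrative names; a and b are assumed to be nonmissing 0/1 variables):

    * dummies for a==1, for b==1, and for their product
    generate byte a1   = (a == 1)
    generate byte b1   = (b == 1)
    generate byte a1b1 = a1 * b1

    logistic y a1 b1 a1b1    // same fit and odds ratios as -logistic y a##b-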
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*