

Re: st: AW: Adjust after regression involving categorical variables

From   [email protected]
To   [email protected]
Subject   Re: st: AW: Adjust after regression involving categorical variables
Date   Thu, 29 Oct 2009 09:34:48 +0000

Thank you all for your helpful comments.  I think that -margins- may be a 
new command in Stata 11 and so I do not have access to it  (I'm still 
using Stata 10). 
The effects coding may be a way forward for me, I'll have to look into 
that a bit more.

Just to clarify my original question - what is the correct adjust command 
to get back to the original coefficients in the regression?  The first 
-adjust- leaves the categorical rep78 'as is', but the resulting 
predictions by foreign do not agree with the coefficients estimated from 
the regression.  However, the second -adjust- sets the rep78 dummies to 
their means in the dataset, which Martin points out are the proportions 
that emerge from -proportion rep78-, and this time the difference between 
the resulting predictions matches the estimated coefficient.  So the second 
-adjust- command seems to be the 'correct' way to treat the categorical 
rep78.  Is this correct?  If so, I am struggling to understand what this 
means for variables that only take values of zero and one.  I know that it 
sets them to their proportions, but how does this make sense when you have 
a categorical variable such as sex, where you are either male or female? 
In this instance what would setting sex to a specific value, say 0.25, 
mean?  Also, if you are setting the dummies for a categorical variable to 
their proportions, what happens to the reference category?  It does not 
have a coefficient, so how can it be set to a specified value?

I hope that all this is understandable.  I am just not fully grasping why 
the second -adjust- command seems to work, and exactly what it is doing.
Thank you all for your input so far.
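
One way to see why the second -adjust- reproduces the coefficient is the sketch below. It assumes that -adjust- sets each listed variable to its overall mean in the estimation sample and leaves unlisted variables as they are, averaging the predictions within each -by()- group. The symbols (b, d_k, p_k, and the F and D superscripts for foreign and domestic) are notation introduced here, not taken from the thread.

```latex
% Fitted model: d_k are the rep78 dummies (k = 2,...,5), f is the foreign dummy
\hat{y} = b_0 + b_w\, w + b_t\, t + \sum_{k=2}^{5} b_k\, d_k + b_f\, f

% Second -adjust-: each d_k is set to its overall mean p_k
% (the share of cars with rep78 = k), identical in both groups
\hat{y}_{F} - \hat{y}_{D} = b_f

% First -adjust-: d_k is left as is, so each group averages to its own
% shares p_k^{F} and p_k^{D}
\hat{y}_{F} - \hat{y}_{D} = b_f + \sum_{k=2}^{5} b_k \left( p_k^{F} - p_k^{D} \right)
```

On this reading, a dummy set to 0.25 does not describe any single observation. The prediction is the one for a population in which 25% of units have the dummy equal to 1. The reference category is not dropped either: its implied share is one minus the sum of the other shares, and its contribution is absorbed into the constant. With the figures quoted in this thread, the composition term in the last line would be about 3226.11 - 3239.838 = -13.73.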


"Martin Weiss" <[email protected]> 
Sent by: [email protected]
28/10/2009 12:18
Please respond to
[email protected]

<[email protected]>

st: AW: Adjust after regression involving categorical variables


Maybe this FAQ will assist you:

I am not sure what the question really is. You are conducting two 
prediction exercises, and they lead to different outcomes, just as you 
expect: in the first one, Stata holds -rep78- at its values in the 
dataset; in the second one it assigns the mean to the dummies. Note that 
these means are the proportions that emerge from -proportion rep78-...


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of
[email protected]
Sent: Wednesday, 28 October 2009 12:44
To: [email protected]
Subject: st: Adjust after regression involving categorical variables

Dear all,

I am struggling to understand the -adjust- command after regression 
involving categorical variables.  My aim in using -adjust- is to obtain 
the predicted values adjusted for the categorical variable, but I am not 
explicitly interested in the categorical variable and so do not want it 
appearing in the -by()- option of -adjust-.  I have been unable to find 
any examples of this kind of use of -adjust-.  I have reproduced my query 
using the auto dataset below. I am using Stata 10.1 SE.

sysuse auto, clear

** just for this example, assume that rep78 is categorical
xi: regress price weight turn i.rep78 i.foreign

** output 
price              Coef.   Std. Err.       t   P>|t|   [95% Conf. Interval]
weight          4.243125    .6699849    6.33   0.000    2.903407
turn           -208.6987    125.9326   -1.66   0.103   -460.5164
_Irep78_2       822.0914    1691.818    0.49   0.629   -2560.907
_Irep78_3        710.281      1560.7    0.46   0.651   -2410.531
_Irep78_4       341.2531    1631.858    0.21   0.835   -2921.848
_Irep78_5       876.4049    1740.224    0.50   0.616   -2603.387
_Iforeign_1     3239.838    859.1453    3.77   0.000    1521.871
_cons          -32.54137    4097.528   -0.01   0.994   -8226.054

** want the predicted values by foreign - not specifically interested in 
** rep78 but wanted to adjust for it, but I am unsure as to how to treat it
** option 1 - set continuous values to mean but leave rep78 as is
adjust weight turn, by(foreign)
** output
 Car type |             xb
 Domestic |     5164.18
  Foreign |     8390.29

** However, you see that 8390.29-5164.18=3226.11, and not 3239.838 as 
** predicted by the model above

** option 2 - treat dummies created by -xi- as continuous, and also set 
** them to their mean
adjust weight turn  _Irep78_2 _Irep78_3 _Irep78_4 _Irep78_5, by(foreign)

** output
 Car type |             xb
 Domestic |      5160.01
  Foreign |      8399.84

You see that the final -adjust- command gives 8399.84-5160.01=3239.83, as 
given by the regression model above.  So it appears that the second 
treatment of the categorical gives the 'correct' predictions.  However, I 
am struggling to interpret exactly what this means for rep78, and does it 
make sense to set variables that are 0/1 to their mean?
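
A possible alternative to setting the dummies to their means is to fix rep78 at one category for every car. The sketch below does this through the `var = #` form of -adjust-, and rests on the same assumption as above, namely that weight and turn are set to their overall means. The choice of categories 1 and 3 is purely illustrative.

```stata
sysuse auto, clear
xi: regress price weight turn i.rep78 i.foreign

* Hold rep78 at its reference category (rep78 == 1): all four dummies set to 0
adjust weight turn _Irep78_2=0 _Irep78_3=0 _Irep78_4=0 _Irep78_5=0, by(foreign)

* Hold rep78 at category 3 instead: only _Irep78_3 is switched on
adjust weight turn _Irep78_2=0 _Irep78_3=1 _Irep78_4=0 _Irep78_5=0, by(foreign)

* The estimated foreign effect, for comparison with the gaps above
lincom _Iforeign_1
```

Because the model is linear with no interactions, the level of the predictions should shift with the chosen category, while the Foreign minus Domestic gap should match the coefficient on _Iforeign_1 in both cases. Setting all four dummies to zero is how the reference category is "set", even though it has no coefficient of its own.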

I would be extremely grateful for any assistance with this.

Many thanks,



This message contains privileged and confidential information intended
for the addressee(s) only. If this message was sent to you in error,
you must not disseminate, copy or take any action in reliance on it and
we request that you notify the sender immediately by return email.

Opinions expressed in this message and any attachments are not
necessarily those held by the Health and Safety Laboratory or any person
connected with the organisation, save those by whom the opinions were

Please note that any messages sent or received by the Health and Safety
Laboratory email system may be monitored and stored in an information
retrieval system.
Think before you print - do you really need to print this email?

Scanned by MailMarshal - Marshal's comprehensive email content security
solution. Download a free evaluation of MailMarshal at
*   For searches and help try:

