Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: Factor variable notation vs. hand made dummy vars
From
"Lachenbruch, Peter" <[email protected]>
To
"[email protected]" <[email protected]>
Subject
st: RE: Factor variable notation vs. hand made dummy vars
Date
Mon, 6 Feb 2012 08:09:29 -0800
It looks like you have two cells (rep78 = 1 and 2) that predict failure perfectly: none of the cars in those categories is foreign.
. tab for rep78
           |                   Repair Record 1978
  Car type |         1          2          3          4          5 |     Total
-----------+-------------------------------------------------------+----------
  Domestic |         2          8         27          9          2 |        48
   Foreign |         0          0          3          9          9 |        21
-----------+-------------------------------------------------------+----------
     Total |         2          8         30         18         11 |        69
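The empty cells can also be flagged without reading the table by eye. The following is only a minimal sketch: the variable name n_foreign is made up for illustration, and it assumes the auto data as shipped with Stata.

```stata
sysuse auto, clear

* Cross-tabulate the outcome against the repair record
tabulate foreign rep78

* Count the foreign cars within each rep78 level (n_foreign is an illustrative name)
bysort rep78: egen n_foreign = total(foreign)

* Levels with no foreign cars are the ones that predict failure perfectly
tabulate rep78 if n_foreign == 0 & !missing(rep78)
```

The last command should list only levels 1 and 2, with 2 and 8 observations, which are the observations logit reports as not used.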
If I use logit for mpg i.rep78, nolog I get
. logit for mpg i.rep78,nolog
note: 1.rep78 != 0 predicts failure perfectly
1.rep78 dropped and 2 obs not used
note: 2.rep78 != 0 predicts failure perfectly
2.rep78 dropped and 8 obs not used
note: 5.rep78 omitted because of collinearity
Logistic regression                               Number of obs   =         59
                                                  LR chi2(3)      =      25.87
                                                  Prob > chi2     =     0.0000
Log likelihood = -25.478287                       Pseudo R2       =     0.3367
------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |      0.131      0.071     1.85   0.064       -0.008       0.270
             |
       rep78 |
          1  |      0.000  (empty)
          2  |      0.000  (empty)
          3  |     -3.136      1.045    -3.00   0.003       -5.184      -1.089
          4  |     -1.120      0.974    -1.15   0.250       -3.029       0.789
          5  |      0.000  (omitted)
             |
       _cons |     -1.723      1.776    -0.97   0.332       -5.205       1.759
------------------------------------------------------------------------------
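With categories 1 and 2 dropped, the fifth category is omitted because of collinearity, so it becomes the reference category and is absorbed into the intercept. The hand-made dummies d1-d5 give the same fit: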
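One way to check this reading is to fit the model directly on the 59 observations that remain, with level 5 set as the base. This is a sketch rather than a definitive replication; the restriction to rep78 levels 3 to 5 is my assumption about which observations are left after the drops.

```stata
sysuse auto, clear

* Keep only the rep78 levels that survive the perfect-prediction drops
* and make level 5 the base category explicitly
logit foreign mpg ib5.rep78 if inrange(rep78, 3, 5), nolog

* Number of observations used in the fit
display e(N)
```

If the interpretation is right, the coefficients on levels 3 and 4 and the constant should agree with the output above.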
. logit for mpg d1-d5,nolog
note: d1 != 0 predicts failure perfectly
d1 dropped and 2 obs not used
note: d2 != 0 predicts failure perfectly
d2 dropped and 8 obs not used
note: d5 omitted because of collinearity
Logistic regression                               Number of obs   =         59
                                                  LR chi2(3)      =      25.87
                                                  Prob > chi2     =     0.0000
Log likelihood = -25.478287                       Pseudo R2       =     0.3367
------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |      0.131      0.071     1.85   0.064       -0.008       0.270
          d1 |      0.000  (omitted)
          d2 |      0.000  (omitted)
          d3 |     -3.136      1.045    -3.00   0.003       -5.184      -1.089
          d4 |     -1.120      0.974    -1.15   0.250       -3.029       0.789
          d5 |      0.000  (omitted)
       _cons |     -1.723      1.776    -0.97   0.332       -5.205       1.759
------------------------------------------------------------------------------
________________________________________
From: [email protected] [[email protected]] On Behalf Of Ulrich Kohler [[email protected]]
Sent: Monday, February 06, 2012 7:25 AM
To: [email protected]
Subject: st: Factor variable notation vs. hand made dummy vars
Hi all,
I cannot replicate the model
. sysuse auto, clear
. tab rep78, gen(d)
. logit for mpg d2-d5
with factor variable notation. I tried
. logit for mpg ib1.rep78
but results differ. Can anybody explain why?
(Note as an aside that
. logit for mpg d1-d5
reproduces the factor variables solution, but normally I would not
specify the model this way)
Update status
Last check for updates: 06 Feb 2012
New update available: none (as of 06 Feb 2012)
Current update level: 30 Jan 2012 (what's new)
Uli
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/