I am not sure this is correct. In the 2 variable cross product case, the
variable T would be equivalent to T1, variable A would be equal to A1, and
the cross product TxA would be equal to T1A1. With these three coefficients
and the constant all 4 combinations can be accounted for.
T = 0, A = 0 would be the constant
T = 0, A = 1 would be A plus the constant
T = 1, A = 0 would be T plus the constant
T = 1, A = 1 would be T plus A plus TxA plus the constant
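As a rough check of the four combinations above, here is a minimal sketch on the auto data. It assumes T is foreign and A is price < 4500 (the same definitions used in the program further down); the variable name ta is just made up for the illustration.

```stata
* Sketch only: two-variable cross product case on the auto data
sysuse auto, clear
gen byte t  = foreign          // T: 0 = domestic, 1 = foreign
gen byte a  = price < 4500     // A: 1 if price is below 4500
gen byte ta = t*a              // cross product TxA (hypothetical name)

regress weight t a ta

* Predicted weight for each of the four combinations
display "T = 0, A = 0: " (_b[_cons])
display "T = 0, A = 1: " (_b[_cons] + _b[a])
display "T = 1, A = 0: " (_b[_cons] + _b[t])
display "T = 1, A = 1: " (_b[_cons] + _b[t] + _b[a] + _b[ta])

* The model is saturated, so these should match the raw cell means
tabulate t a, summarize(weight) means
```

With only T and A in the model there are four cells and four parameters, so the predictions reproduce the cell means exactly.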
For the 3 variable case, in the program below, the reference category for
the cross product method is T = 0, A = 0, and B = 0, which will be equal to
the constant. At the end of the program, a table is produced showing the
predicted value for all 8 combinations of the categorical variables under
both the "all dummy" and the "cross product" methods. The results are
identical for both methods.
+---------------------------------------------------+
| categories            all_dummies   cross_product |
|---------------------------------------------------|
| T = 0, A = 0, B = 0      3424.737        3424.737 |
| T = 0, A = 0, B = 1      4062.308        4062.308 |
| T = 0, A = 1, B = 0        2730.5          2730.5 |
| T = 0, A = 1, B = 1      3368.071        3368.071 |
|---------------------------------------------------|
| T = 1, A = 0, B = 0          2372            2372 |
| T = 1, A = 0, B = 1          3420            3420 |
| T = 1, A = 1, B = 0      1991.667        1991.667 |
| T = 1, A = 1, B = 1      3039.667        3039.667 |
+---------------------------------------------------+
Scott
----------------------------------------------------------
sysuse auto, clear
qui {
    * Three 0/1 variables: t (foreign), a (price < 4500), b (mpg < 17)
    gen t = foreign
    mark a if price < 4500
    mark b if mpg < 17

    * tab, gen() creates t1/t2, a1/a2, b1/b2 (suffix 1 = level 0, 2 = level 1)
    tab t, gen(t)
    tab a, gen(a)
    tab b, gen(b)

    * "All dummy" method: one dummy per T-by-A and per T-by-B combination
    gen t0a0 = t1*a1
    gen t1a0 = t2*a1
    gen t0a1 = t1*a2
    gen t1a1 = t2*a2
    gen t0b0 = t1*b1
    gen t1b0 = t2*b1
    gen t0b1 = t1*b2
    gen t1b1 = t2*b2

    * "Cross product" method: -xi- creates the _I* variables
    xi i.t*i.a i.t*i.b

    * Fit the all-dummy model; collinear dummies are dropped (their _b[] is 0)
    reg weight t0a0-t1b1, nocon nohead

    * Predicted value for each of the 8 combinations, stored in rows 1 to 8
    gen all_dummies = .
    local n = 1
    forv i = 0/1 {
        forv j = 0/1 {
            forv k = 0/1 {
                replace all_dummies = _b[t`i'a`j'] + _b[t`i'b`k'] in `n'
                local n = `n' + 1
            }
        }
    }

    * Row labels, in the same order as the predictions
    gen categories = ""
    local n = 1
    forv i = 0/1 {
        forv j = 0/1 {
            forv k = 0/1 {
                replace categories = "T = `i', A = `j', B = `k'" in `n'
                local n = `n' + 1
            }
        }
    }

    * Fit the cross product model and build the same 8 predictions
    reg weight _I*, nohead
    gen cross_product = .
    replace cross_product = _b[_cons] in 1
    replace cross_product = _b[_cons] + _b[_Ib_1] in 2
    replace cross_product = _b[_cons] + _b[_Ia_1] in 3
    replace cross_product = _b[_cons] + _b[_Ib_1] + _b[_Ia_1] in 4
    replace cross_product = _b[_cons] + _b[_It_1] in 5
    replace cross_product = _b[_cons] + _b[_It_1] + _b[_Ib_1] + _b[_ItXb_1_1] in 6
    replace cross_product = _b[_cons] + _b[_It_1] + _b[_Ia_1] + _b[_ItXa_1_1] in 7
    replace cross_product = _b[_cons] + _b[_It_1] + _b[_Ia_1] + _b[_Ib_1] ///
        + _b[_ItXa_1_1] + _b[_ItXb_1_1] in 8
}
l categories all_dummies cross_product in 1/8, noobs abb(32) sep(4)
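
Another way to see that the two parameterizations give the same fit is to compare the fitted values observation by observation. The following is only a sketch under the same assumptions as the program above (T = foreign, A = price < 4500, B = mpg < 17); the names yhat_xi, yhat_dummies and rss_xi are made up for the illustration.

```stata
* Sketch only: compare the fit of the two parameterizations directly
sysuse auto, clear
gen byte t = foreign
gen byte a = price < 4500
gen byte b = mpg < 17

* Cross product method: -xi- builds _It_1, _Ia_1, _ItXa_1_1, _Ib_1, _ItXb_1_1
xi i.t*i.a i.t*i.b
regress weight _I*
predict double yhat_xi
scalar rss_xi = e(rss)

* All dummy method: one dummy per T-by-A and per T-by-B combination
forvalues i = 0/1 {
    forvalues j = 0/1 {
        gen byte t`i'a`j' = (t == `i') & (a == `j')
        gen byte t`i'b`j' = (t == `i') & (b == `j')
    }
}
regress weight t0a0 t0a1 t1a0 t1a1 t0b0 t0b1 t1b0 t1b1, noconstant
predict double yhat_dummies

* Difference in residual sums of squares (should be zero up to rounding)
display rss_xi - e(rss)

* Stops with an error if any fitted value differs between the two models
assert reldif(yhat_xi, yhat_dummies) < 1e-8
```

The output of the second regression should also show two of the eight dummies dropped as collinear, which leaves six estimated coefficients. That is the same count as the five -xi- terms plus the constant.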
> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of "Laplante, Benoît"
> Sent: Thursday, June 16, 2005 3:22 PM
> To: [email protected]
> Subject: st: -xi- and interactions (was:your stata query)
>
> The recent postings on -xi- reminded me of the following riddle about
> interactions.
>
> Say you have two variables, T and A. Each has two values, low and high,
> coded 0 and 1 respectively for both variables. You assume that the
> effect of A varies across the values of T, which is a very basic
> definition of what an interaction is about. There are (at least) two ways
> to deal with the problem.
>
> The first one is simply to build dummies that represent all the
> combinations of values of the two variables and put all of them, minus
> one, in the equation. So you would have T0A0, T0A1, T1A0 and T1A1. The
> most obvious choice would be to exclude T0A0, but excluding any of the
> four variables would provide equivalent results. So you would use three
> variables to represent your interaction, say T0A1, T1A0 and T1A1.
>
> Another method, the one used by -xi-, is to compute cross products of the
> two variables. The procedure gives you an equation in which you are using
> three variables, two of them labelled as the original variables and the
> third one as their product. So you are using variables labelled T, A and
> TA. Given the original coding scheme and the use of a cross-product, T is
> actually equivalent to T1A0, A to T0A1 and TA to T1A1-(T0A1+T1A0).
>
> Both methods will produce equivalent results.
>
> Now let's say that you are adding a second variable to your equation, B,
> whose effect is also assumed to vary across the values of T. Using the
> first method, you would build an equation containing 6 terms, that is
> three out of T0A0, T0A1, T1A0 and T1A1, and three out of T0B0, T0B1, T1B0
> and T1B1.
>
> If you were to use the second method, you would be using 5 terms: T, A,
> TA, B, TB. The fits of the two models are not the same and I never found a
> reference that dealt with how two such different models could be
> equivalent.
>
> Actually, the second one seems to be equivalent to an equation in which,
> using T0A0 and T0B0 as reference categories, T=(T1A0+T1B0)/2, which would
> be an unwanted assumption in most cases.
>
> Does anyone have an answer?
>
> Benoît Laplante, professeur
> Université du Québec
> Institut National de la Recherche Scientifique
> Urbanisation, Culture et Société
> http://www.inrs-ucs.uquebec.ca/default.asp?p=lapl
>
>
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/