I agree with Tony. Making things worse, interactions that share a
variable will be correlated with one another, making it difficult to
conclude much about any single interaction. If you have no specific
hypotheses, you must also apply the Bonferroni or a similar correction
to conclusions about the 30 original variables. Quite likely some of
these are not needed, so why are you testing all the interaction terms?
If you jackknife or bootstrap the entire procedure, you are likely to
find that the set of "significant" interactions is not stable.
Another likely error is that the linear form of some of the 30
original variables is not correct. And multicollinearity among so
many variables is possible, with its attendant difficulties. For
guidance on how to proceed, read Frank Harrell, Regression Modeling
Strategies, Springer, New York, 2001.
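
On the purely mechanical question of duplicates, here is a minimal
sketch that builds each pairwise product only once by looping over
i < j. It assumes the variables really are named var1-var30, as in
JG's post; the macro names (vars, k, km1, jstart, npairs) are my own
and purely illustrative. It also assumes the combined names stay
within Stata's 32-character limit. It does nothing about the
statistical concerns above.

```stata
* Sketch only: one product per unordered pair, skipping squares
* and mirror-image duplicates such as var2Xvar1.
unab vars : var1-var30            // expand the varlist into names
local k : word count `vars'       // number of variables (30 here)
local km1 = `k' - 1
local npairs = 0

forvalues i = 1/`km1' {
    local x : word `i' of `vars'
    local jstart = `i' + 1
    forvalues j = `jstart'/`k' {
        local y : word `j' of `vars'
        generate double `x'X`y' = `x' * `y'
        local ++npairs
    }
}

* With 30 variables this is 30*29/2 = 435 pairs, matching Tony's count.
display "pairs created: `npairs'"
display "Bonferroni threshold at alpha = .05: " 0.05/`npairs'
```

The second display line just reproduces Tony's .05/435 arithmetic
from whatever number of pairs was actually created.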
-Steve
On Thu, Sep 10, 2009 at 11:55 AM, Lachenbruch, Peter
<[email protected]> wrote:
> This seems like data snooping. You will have 435 products. Have you
> considered how you will use these? If you use a Bonferroni on
> significance tests, you will need to be less than 0.00011 (.05/435) to
> even get a little excited. You might consider defining some specific
> interactions of interest.
>
> Tony
>
> Peter A. Lachenbruch
> Department of Public Health
> Oregon State University
> Corvallis, OR 97330
> Phone: 541-737-3832
> FAX: 541-737-4001
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Jean-Gael
> Collomb
> Sent: Wednesday, September 09, 2009 8:13 AM
> To: [email protected]
> Subject: st: Creating interaction terms of 30 continuous variables
>
> Hello,
> I am trying to explore all possible interaction terms in my dataset
> and I am not sure how to do this efficiently using Stata. I have about
> 30 continuous variables. It seems that xi is limited to
> non-continuous variables and xi3 does not seem to work for 30
> variables, unless I am doing something wrong. I came up with a
> solution using "foreach" to generate all the interaction terms and
> then try them in a regression model:
>
> foreach x of varlist var1-var30 {
> foreach y of varlist var1-var30 {
> generate `x'X`y'=`x'*`y'
> }
> }
>
> The first problem I have is that this creates duplicates (i.e.
> var1Xvar2 is the same as var2Xvar1). Furthermore, I feel it is a
> cumbersome way to do it, and I wonder if there is a more efficient way
> to generate all possible pairwise interaction terms or, better yet, to
> have an exploratory regression model test all these pairs and
> select the best model. I saw someone doing something
> similar very quickly in NCSS and I was hoping I could do the same in
> Stata.
> Thanks for your feedback.
>
>
> Jean-Gael "JG" E. Collomb
> PhD candidate
> School of Natural Resources and Environment / School of Forest
> Resources and Conservation
> University of Florida
> [email protected]
> [email protected]
> +1 (352) 870 6696
>
>
>
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
--
Steven Samuels
[email protected]
18 Cantine's Island
Saugerties NY 12477
USA
845-246-0774