
st: -tuples- now available from SSC

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: -tuples- now available from SSC
Date	Sun, 3 Dec 2006 18:48:39 -0000

Thanks to Kit Baum, -tuples- is now available from SSC. 
You need Stata 8 or 9. 

The writing of -tuples- was provoked by various recent 
postings which have touched on cycling through possible 
models using different subsets of predictors. 

Assume the character of the model is decided, except for
precisely which predictors are used out of a given list. 
Given p predictors, each one is in or out, 2 
possibilities for each and thus 2^p overall. If you are 
not interested in the single case of all of them out, 
subtract 1 to get 2^p - 1. 
 
For small p, it can be possible, although possibly not 
-gllamm-orous, to look at all of the candidate models. 

Note that some people doing this are in search of the 
Grail, the "best" model; others are fired by scepticism
and exploring the extent to which quite different models
can appear about equally "good" (or "bad", as the case 
may be). (Insert lengthy discussion about the weasel 
words in quotation marks.) 
 
Some users have offered their versions of how to do 
it, including Paul Millar's -bic- and Alan Feiveson's 
-tryem-, which in essence both cycle through models 
and emit a digest of results. 
 
In contrast, as I offer a new program -tuples- (which is 
itself a modification of -selectvars-, on SSC) I want to 
stress how little it does. You have to do all the 
modelling yourself! In a way it is more like -levelsof- 
than either -tryem- or -bic- (or -allpossible-, another 
beast in the same part of the zoo). 

In general, you give -tuples- a list of items 
and it produces a bundle of local macros in the 
caller's space, naming in turn all the possible singletons, 
all the possible pairs and so forth. Here I feed it -frog toad newt- 
and ask it to show what it is doing:
 
. tuples frog toad newt, display
tuple1: newt
tuple2: toad
tuple3: frog
tuple4: toad newt
tuple5: frog newt
tuple6: frog toad
tuple7: frog toad newt
 
Note that -frog toad newt- were not variables 
in memory. -tuples- tries a list to see if it 
is a varlist, but is happy if it is not. (Conversely 
you can insist on the varlist interpretation, or 
insist that no interpretation as varlist is allowed.) 

For a slightly more serious example, let us
imagine that our predictors are all car size variables: 

. sysuse auto
(1978 Automobile Data)
 
. tuples head trunk-length displacement, display
tuple1: displacement
tuple2: length
tuple3: weight
tuple4: trunk
tuple5: headroom
tuple6: length displacement
tuple7: weight displacement
tuple8: weight length
tuple9: trunk displacement
tuple10: trunk length
tuple11: trunk weight
tuple12: headroom displacement
tuple13: headroom length
tuple14: headroom weight
tuple15: headroom trunk
tuple16: weight length displacement
tuple17: trunk length displacement
tuple18: trunk weight displacement
tuple19: trunk weight length
tuple20: headroom length displacement
tuple21: headroom weight displacement
tuple22: headroom weight length
tuple23: headroom trunk displacement
tuple24: headroom trunk length
tuple25: headroom trunk weight
tuple26: trunk weight length displacement
tuple27: headroom weight length displacement
tuple28: headroom trunk length displacement
tuple29: headroom trunk weight displacement
tuple30: headroom trunk weight length
tuple31: headroom trunk weight length displacement
 
The results match standard facts from elementary 
combinatorics: 
 
. di comb(5,1) + comb(5,2) + comb(5,3) + comb(5,4) + comb(5,5)
31
 
. di 2^5 - 1
31
 
Now, I suggest, you have the main tool you really need, as
everything else is already possible through standard tools. 
 
Suppose you want to store regression R^2 and the predictor list 
in two variables. 
 
(1) initialise 
 
gen rsq = . 
gen predictors = "" 
 
(2) loop 

quietly forval i = 1/31 { 
 	regress mpg `tuple`i'' 
 	replace rsq = e(r2) in `i' 
 	replace predictors = "`tuple`i''" in `i' 
} 
 
(3) play 
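
One way to play, sketched on the assumption that the loop above 
has just been run, so that rsq and predictors hold results in 
observations 1 to 31. Showing five models is an arbitrary choice. 

```stata
* assumes rsq and predictors were filled in by the loop above
* gsort puts missing values last in a descending sort by default
gsort -rsq
list predictors rsq in 1/5
```

This re-sorts the data in memory, which should be harmless once 
the models have been fitted. 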
 
Suppose you prefer to put stuff in a file: 
 
(1) initialise 

<set up file> 
 
(2) loop 
 
quietly forval i = 1/31 { 
 	regress mpg `tuple`i'' 
 	<post to file> 
} 
 
(3) play 
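
A minimal sketch of the file route using -postfile-, one of 
several ways to fill in the placeholders above. The file name 
tuples_results and the variable names predictors and rsq are 
made up for illustration; str80 is assumed wide enough for the 
longest predictor list here. 

```stata
sysuse auto, clear
tuples headroom trunk weight length displacement

* (1) initialise: a results file with one string and one numeric variable
tempname results
postfile `results' str80 predictors double rsq using tuples_results, replace

* (2) loop: one regression and one posted row per tuple
quietly forval i = 1/`ntuples' {
	regress mpg `tuple`i''
	post `results' ("`tuple`i''") (e(r2))
}
postclose `results'

* (3) play: read the results back in and look at them
use tuples_results, clear
gsort -rsq
list in 1/5
```

Posting to a file leaves the original data untouched, at the 
cost of a second dataset to manage. 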
 
And so on. You can vary or complicate this arbitrarily, 
varying model commands, etc., etc. Note that the local 
macro tuple0 is not defined by -tuples-, so you can 
exploit its non-existence: 
 
quietly forval i = 0/31 { 
 	regress mpg `tuple`i'' 
 	... 
} 
 
`tuple0' evaluates to an empty string, so you can get 
results for a null model with no predictors if you want. 
Just don't try putting results in observation 0. 
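
A sketch of one way round the observation 0 problem: shift each 
result down one row, so that the null model lands in observation 1. 
The local macro row is an illustrative name of my choosing, not 
something -tuples- defines. 

```stata
sysuse auto, clear
tuples headroom trunk weight length displacement

gen rsq = .
gen predictors = ""

quietly forval i = 0/`ntuples' {
	* tuple0 is undefined, so `tuple0' is empty: a constant-only model
	local row = `i' + 1
	regress mpg `tuple`i''
	replace rsq = e(r2) in `row'
	replace predictors = "`tuple`i''" in `row'
}
```

This needs at least 32 observations; the auto data have 74. 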
 
The algorithm of -tuples- is less efficient than 
it could be because I twist results to be sure that 
singletons, pairs, triples, ... are emitted in that 
sequence. Somebody might have a smarter way of doing that. 

-tuples- might have other applications beyond predictor choice. 

If you object to doing powers of 2 in your head, the number 
of tuples produced is itself returned in local macro ntuples. 
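
For instance, a loop need not hard-code 31. A short sketch, 
using the same auto predictors as above: 

```stata
sysuse auto, clear
tuples headroom trunk weight length displacement

* ntuples is left behind by -tuples- alongside tuple1, tuple2, ...
display "`ntuples' tuples to work through"
forval i = 1/`ntuples' {
	display "`i': `tuple`i''"
}
```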

Some other wrinkles are discussed in the help file. 
 
Nick 
[email protected] 
 

