Jun Xu posted twice, here labelled (1) and (2):
> (1)
>
> > puzzled by a problem when writing an ado file. Suppose I have
> > var1 var2 var3 var4 var5......vark, and I want to do the
> > following loop:
> > ******************************************************************
> > var1
> > var2
> > var3
> > ...
> > ...
> > ...
> > vark
> > var1 var2
> > var1 var3
> > var1 var4
> > ...
> > ...
> > var1 vark
> > var2 var3
> > var2 var4
> > ...
> > ...
> > var(k-1) vark
> > ...
> > ...
> > var1 var2 var3
> > var1 var2 var4
> > var1 var2 var5
> > ...
> > var1 var2 var3....vark
> > ******************************************************************
> > Basically, what I want to do is step through all combinations
> > in a systematic way, from univariate, bivariate, trivariate, to
> > multivariate. Or, put another way, for every variable in the
> > variable list there is an indicator variable associated with it:
> > I either take this variable in or leave it out in each run, so
> > there should be 2^k possibilities. I have no idea how to handle
> > that. Could anyone give me a hint? Many thanks in advance.
>
> (2)
>
> I think I might not have explained my problem clearly. I have k
> indicator variables (coded as 1 or 0) and I would like to know the
> response patterns (for example, for latent class analysis) to these
> k variables. For example,
>
> var1 var2 var3 ....vark
> 1 0 0 0
> 0 1 0 0
> ...
> 1 1 0 0
> ...
> ...
> ...
> 1 1 1 1
>
> I would like to know, for each response pattern, how many cases
> there are, and to have this programmed into an ado file. My key
> problem here is how to run through all the combinations
> (univariate, bivariate, and trivariate).
>
> One possibility is to use the following code (or a revised version
> to fit into an ado file):
>
> ******************************************************************
> clear
> for num 1/6: set obs 100\ gen xX=invnorm(uniform()) \ gen DxX=xX>0.6
> gen pattern=0
> local i=1
> while `i'<6 {
> replace pattern=pattern+Dx`i'*10^(6-`i')
> local i=`i'+1
> }
>
> aorder
> list Dx1-Dx6 pattern
> sort pattern
> list pattern
> gen count=1
> collapse (sum) count, by(pattern)
> ***********************************************************
> The resulting data matrix looks like:
>
> ============================
> pattern count
> 0 16
> 10 10
> 100 5
> 110 6
> 1000 8
> 1010 7
> 1100 2
> 1110 1
> 10000 11
> 10010 3
> 10100 2
> 10110 2
> 11000 2
> 11010 2
> 11100 1
> 100000 7
> 100010 1
> 100100 2
> 101000 4
> 101010 1
> 110000 1
> 110010 2
> 110100 2
> 111000 1
> 111010 1
> =================================
>
>
> The problem here is that it presents only the response patterns
> that have at least one case, and it's hard to control their order
> (they are now listed in numerical order, from small to big).
> But what if I want to go through each combination (2^k possible
> ways) in a systematic way and list every response pattern
> frequency, even though some of them have zero cases? What I mean
> by a systematic way is:
>
> ******************************************************************
> var1
> var2
> var3
> ...
> ...
> ...
> vark
> var1 var2
> var1 var3
> var1 var4
> ...
> ...
> var1 vark
> var2 var3
> var2 var4
> ...
> ...
> var(k-1) vark
> ...
> ...
> var1 var2 var3
> var1 var2 var4
> var1 var2 var5
> ...
> var1 var2 var3....vark
> ******************************************************************
>
> or in binary coding
> ****************************************************************
> 1 0 0 0 0 .....0
> 0 1 0 0 0 .....0
> 0 0 1 0 0 .....0
> ...
> ...
> ...
> 0 0 0 0 0 .....1
> 1 1 0 0 0 .....0
> 1 0 1 0 0 .....0
> 1 0 0 1 0 .....0
> ...
> ...
> 1 1 1 0 0 .....0
> 1 0 1 1 0 .....0
> ...
> ...
> ...
> ...
> 1 1 1 1 1 .....1
> ***********************************************
>
> Here I haven't shown a summarize command that could grab the
> number of cases for each response pattern. But basically I will
> run through each combination and calculate the frequency for that
> particular combination, even though there might be zero cases.
> Thanks a lot.

1. To get a tabulation of those patterns that occur at least once
in the data,

    egen all = concat(var1-vark)
    tab all
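
If patterns with zero frequency should be shown too, one
possibility (a suggestion only, not tested on your data) is
-contract- with its -zero- option, which replaces the data in
memory with one observation per combination and a frequency
variable _freq:

    preserve
    contract var1-vark, zero
    list
    restore

Here var1-vark stands for your own variables, as above. Note that
-zero- fills in combinations of the values that each variable
actually takes, so all 2^k patterns appear only if every indicator
takes both 0 and 1 somewhere in the data.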

2. The following program suggests some possible lines of attack.

    program permlist, rclass
        version 8
        syntax varlist
        tokenize `varlist'
        local nvars : word count `varlist'
        local imax = 2^`nvars' - 1

        * each i in 1 ... 2^k - 1, written in binary, picks one subset
        forval i = 1 / `imax' {
            qui inbase 2 `i'
            * pad with leading zeros to exactly k digits
            local which : di %0`nvars'.0f `r(base)'
            local vars
            forval j = 1 / `nvars' {
                local char = substr("`which'",`j',1)
                * digit j is 1 if variable j is chosen
                if `char' {
                    local vars "`vars'``j'' "
                }
            }
            local vlist `"`vlist'"`vars'" "'
        }

        * reorder the subsets by number of variables: 1, 2, ..., k
        local varlist
        forval i = 1 / `nvars' {
            foreach w of local vlist {
                local nv : word count `w'
                if `i' == `nv' {
                    local varlist `"`varlist'"`w'" "'
                }
            }
        }
        return local varlist `"`varlist'"'
    end

I use the undocumented -inbase- command to get the binary
equivalent of 1 ... 2^k - 1 (I omit the null case in which none
of the variables is chosen). It is important to make leading zeros
explicit, so that digit j always corresponds to variable j.
-inbase- is in Stata 8; for Stata 7 or Stata 6 type

    . findit inbase

or use the search method of your choice to find it in Bill
Gould's files. In Stata, type

    . type http://www.stata.com/users/wgould/inbase/inbase.ado

Then each variable is or is not chosen according to whether the
corresponding digit is 1 or 0.
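
For example, with three variables a b c, the numbers 1 to 7 give
001, 010, 011, 100, 101, 110, 111, which pick out c, b, b c, a,
a c, a b and a b c in turn. A minimal sketch to see the padded
binary strings, assuming -inbase- is installed and leaves its
result in r(base) as in the program above:

    forval i = 1/7 {
        qui inbase 2 `i'
        di %03.0f `r(base)'
    }
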
Then we need to sort, for your purposes, according to the number
of variables chosen. The whole list is left behind in memory, in
r(varlist), in the form (e.g. for a b c d)

    "d " "c " "b " "a " ... "a b c d "

I think the above program should also work with very minor
modifications in Stata 7.
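
A sketch of how the program might be called, assuming it has been
saved as permlist.ado or run from a do-file. The auto data and
the three variables are used only for illustration:

    sysuse auto, clear
    permlist price mpg weight
    foreach vars in `r(varlist)' {
        di "`vars'"
    }

Each pass through the loop sees one subset of the variables in
the local macro vars, so -di- could be replaced by whatever
command should be run for that subset.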

3. For an implementation of a different, and less general,
technique see -allpossible- on SSC.

Nick