Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Jacob Model <jacob.model@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: creating combinations of all 49 variables and counting their frequencies |
Date | Sat, 1 Mar 2014 21:57:01 -0800 |
Scratch that last part. The network thing wouldn't work - it was a dumb idea.

On Sat, Mar 1, 2014 at 9:50 PM, Jacob Model <jacob.model@gmail.com> wrote:
> So I think there's a real question of whether order matters for your
> combinations. So let's say A, B, and C are different adoptions. Is the
> combination ABC equivalent to BAC?
>
> Some folks at stackoverflow have talked about this... here's the link
> http://stackoverflow.com/questions/3467914/is-there-an-algorithm-to-generate-all-unique-circular-permutations-of-a-multiset
>
> And here's a CS theoretical paper which talks about how you might
> implement an algorithm to do this.
> http://www.cis.uoguelph.ca/~sawada/papers/alph.pdf
> http://www.sciencedirect.com/science/article/pii/S0196677400911088
>
> My guess (as an amateur programmer) is that you probably could write
> some algorithm that would take into account that combinations can be
> deconstructed as subsets of each other and you'll automatically know
> they're 0. So if you're working from the bottom up... if you already
> know that the frequency of AB is zero and of CD is zero... by
> construction ABCD will have zero frequency. So you wouldn't have to
> compute any combination that contained AB or CD.
>
> Another way of thinking about this may be in a network framework with
> each adoption pair being a tie between nodes. So if you had A and B
> adopt you could think of them having a tie. If A, B and C adopted it
> would be a triad. Etc. The advantage with this is you could store it
> as an edgelist, which is pretty efficient. In other words, you could
> put in every observation of groups of two features - a much more
> manageable number - and create a resulting database that could tell
> you the frequency of larger combinations.
>
> -Jacob
>
> On Sat, Mar 1, 2014 at 9:09 PM, Krisha Lim <krisha.lim@ualberta.ca> wrote:
>> Hi,
>>
>> I have 49 binary variables. I am interested in doing all combinations for those 49 variables and calculating the frequencies.
>> I am not sure how to do this in Stata. The tuples command just generates all the tuples, but it stopped after 9999999 tuples. Would you be able to help me?
>>
>> To give some context, each binary variable indicates adoption (so 1 = adopt). I want to figure out the most used technique or combination of techniques in my dataset. I know this will be a very, very large number, but I hope there's a way to do it.
>>
>> Thanks!
>>
>> Krisha
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
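
For what it's worth, here is a rough sketch of the two ideas in this thread. It is written in Python rather than Stata, purely as an illustration, and it assumes order does not matter (ABC is the same as BAC). The function names, the `rows` layout (one dict of 0/1 values per observation) and the `min_count` threshold are all made up for the example. The first function only counts the adoption patterns that actually occur in the data, so there are at most as many of them as there are observations, instead of the 2^49 (about 5.6 x 10^14) possible ones. The second is the bottom-up idea: a larger combination is counted only if every smaller combination inside it was already found often enough. Within Stata, the -contract- command is probably the place to start for the first part.

```python
from collections import Counter
from itertools import combinations


def pattern_counts(rows):
    """Count each exact adoption pattern (the set of techniques with value 1)."""
    return Counter(
        frozenset(name for name, value in row.items() if value == 1)
        for row in rows
    )


def frequent_combinations(rows, min_count=1):
    """Bottom-up count of combinations adopted together at least min_count times.

    A combination is counted in every observation that contains it, whatever
    else that observation adopted. A candidate of size k is only generated if
    all of its subsets of size k-1 survived the previous level, so anything
    containing a zero-frequency pair is never computed.
    """
    adopted = [
        frozenset(name for name, value in row.items() if value == 1)
        for row in rows
    ]
    current = [frozenset([name]) for name in sorted(set().union(*adopted))]
    result = {}
    size = 1
    while current:
        # Count how many observations contain each candidate combination.
        counts = {c: sum(1 for obs in adopted if c <= obs) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_count}
        result.update(survivors)
        # Build candidates one element larger from pairs of survivors.
        size += 1
        candidates = set()
        for a, b in combinations(list(survivors), 2):
            union = a | b
            if len(union) == size and all(
                frozenset(sub) in survivors
                for sub in combinations(union, size - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return result


# Hypothetical toy data: four observations, three techniques.
rows = [
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 0, "C": 1},
    {"A": 0, "B": 0, "C": 1},
]

exact = pattern_counts(rows)
together = frequent_combinations(rows, min_count=1)
print(exact.most_common(1))
print(len(together))
```

In the toy data B and C are never adopted together, so ABC is never generated as a candidate at all. With 49 variables the bottom-up version can still get large if many techniques are commonly adopted together, so raising `min_count` is the obvious knob to keep it manageable.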