
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: creating combinations of all 49 variables and counting their frequencies


From   Jacob Model <jacob.model@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: creating combinations of all 49 variables and counting their frequencies
Date   Sat, 1 Mar 2014 21:57:01 -0800

Scratch that last part. The network thing wouldn't work - it was a dumb idea.

On Sat, Mar 1, 2014 at 9:50 PM, Jacob Model <jacob.model@gmail.com> wrote:
> So I think there's a real question of whether order matters for your
> combinations. Let's say A, B, and C are different adoptions. Is the
> combination ABC equivalent to BAC?
>
> Some folks at Stack Overflow have discussed this. Here's the link:
> http://stackoverflow.com/questions/3467914/is-there-an-algorithm-to-generate-all-unique-circular-permutations-of-a-multiset
>
> And here's a theoretical CS paper that discusses how you might
> implement an algorithm to do this.
> http://www.cis.uoguelph.ca/~sawada/papers/alph.pdf
> http://www.sciencedirect.com/science/article/pii/S0196677400911088
>
> My guess (as an amateur programmer) is that you could probably write
> an algorithm that exploits the fact that larger combinations contain
> smaller ones, so you would automatically know that some frequencies
> are 0. Working from the bottom up: if you already know that AB has
> zero frequency and CD has zero frequency, then by construction ABCD
> has zero frequency too (either one alone is enough). So you wouldn't
> have to compute any combination that contains AB or CD.
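
The bottom-up pruning described above is essentially the idea behind Apriori-style frequent itemset counting. Below is a minimal Python sketch of it, not anything from the thread: the function name `frequent_combinations`, the `min_count` threshold, and the toy data are all hypothetical. Note that it counts observations that *contain* a combination (possibly alongside other techniques), and with 49 variables it stays tractable only if most combinations get pruned.

```python
from itertools import combinations


def frequent_combinations(rows, min_count=1):
    """Count combinations level by level, never counting a combination
    that has a sub-combination already known to fall below min_count.

    rows: iterable of sets; each set holds the names of the techniques
    adopted (variables equal to 1) in one observation.
    Returns a dict mapping frozenset -> number of observations containing it.
    """
    rows = [frozenset(r) for r in rows]

    # Level 1: count each single technique.
    counts = {}
    for r in rows:
        for item in r:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {k: c for k, c in counts.items() if c >= min_count}
    result = dict(current)

    size = 1
    while current:
        size += 1
        # Build candidates of the next size from surviving combinations,
        # keeping one only if every sub-combination one item smaller survived.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == size and all(
                frozenset(sub) in current
                for sub in combinations(union, size - 1)
            ):
                candidates.add(union)

        # Only the surviving candidates are actually counted against the data.
        current = {}
        for cand in candidates:
            n = sum(1 for r in rows if cand <= r)
            if n >= min_count:
                current[cand] = n
        result.update(current)
    return result


# Hypothetical toy data: D never occurs together with anything else,
# so no combination containing D is ever generated or counted.
rows = [{"A", "B"}, {"A", "B", "C"}, {"C"}, {"A", "C"}, {"D"}]
result = frequent_combinations(rows)
```

Raising `min_count` above 1 prunes more aggressively, at the cost of dropping rare combinations from the output.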
>
> Another way of thinking about this may be in a network framework with
> each adoption pair being a tie between nodes. So if you had A and B
> adopt you could think of them having a tie. If A, B and C adopted it
> would be a triad. Etc. The advantage with this is you could store it
> as an edgelist, which is pretty efficient. In other words, you could
> put in every observation of groups of two features - a much more
> manageable number - and create a resulting database that could tell
> you the frequency of larger combinations.
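
The reply at the top of this message retracts the network idea without saying why. One plausible reason, sketched here in Python under my own assumptions (the helper names and both toy datasets are hypothetical), is that pairwise counts alone do not determine the frequency of larger combinations: two datasets can yield the same edgelist yet differ in how often the triad occurs.

```python
from collections import Counter
from itertools import combinations


def pair_counts(rows):
    """Edgelist view: how many observations contain each pair of techniques."""
    counts = Counter()
    for r in rows:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(r), 2):
            counts[pair] += 1
    return counts


def count_containing(rows, combo):
    """Number of observations that contain every technique in combo."""
    wanted = set(combo)
    return sum(1 for r in rows if wanted <= set(r))


# Two hypothetical datasets that produce identical pair counts.
three_pairs = [{"A", "B"}, {"B", "C"}, {"A", "C"}]  # no observation has all three
one_triad = [{"A", "B", "C"}]                        # one observation has all three

same_edges = pair_counts(three_pairs) == pair_counts(one_triad)
triad_counts = (
    count_containing(three_pairs, "ABC"),
    count_containing(one_triad, "ABC"),
)
```

Here `same_edges` is true while the two triad counts differ, so an edgelist by itself cannot be queried for the frequency of ABC.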
>
> -Jacob
>
> On Sat, Mar 1, 2014 at 9:09 PM, Krisha Lim <krisha.lim@ualberta.ca> wrote:
>> Hi,
>>
>> I have 49 binary variables. I am interested in forming all combinations of those 49 variables and calculating their frequencies. I am not sure how to do this in Stata. The tuples command generates the tuples, but it stopped after 9,999,999 of them. Would you be able to help me?
>>
>> To give some context, each binary variable indicates adoption (so 1 = adopt). I want to figure out the most used technique or combination of techniques in my dataset. I know the number of combinations will be very large, but I hope there's a way to do it.
>>
>> Thanks!
>>
>> Krisha
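
If the goal in the question above is the most common *exact* adoption pattern, one option is to skip enumerating all 2^49 possible combinations and tally only the patterns that actually occur, since there can be no more distinct patterns than observations. A minimal Python sketch of that tally follows; the function names, variable names, and data are hypothetical. In Stata itself, `contract` produces a comparable frequency table of observed value combinations.

```python
from collections import Counter


def pattern_frequencies(records):
    """Tally each distinct 0/1 pattern that actually occurs in the data.

    records: iterable of equal-length sequences of 0/1 values,
    one sequence per observation.
    """
    return Counter(tuple(r) for r in records)


def describe(pattern, names):
    """Turn a 0/1 pattern into a readable label listing the adopted techniques."""
    adopted = [name for name, value in zip(names, pattern) if value == 1]
    return "+".join(adopted) or "(none)"


# Hypothetical data: three binary variables, four observations.
names = ["v1", "v2", "v3"]
records = [(1, 0, 1), (1, 0, 1), (0, 1, 0), (1, 1, 1)]

freq = pattern_frequencies(records)
top_pattern, top_count = freq.most_common(1)[0]
top_label = describe(top_pattern, names)
```

This answers a narrower question than counting every sub-combination: an observation with v1, v2, and v3 all adopted is tallied only under that full pattern, not under v1+v3.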
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/


