Your problem is much clearer, but I don't have much comfort for you.
Your best chance of getting things done much faster is with a different hardware set-up. Naturally I have no idea what is possible for you.
If this were my problem, I would be thinking about two quite different strategies:
1. Letting -reshape- chug along and doing something else while I wait. Sometimes one can waste more time finding a better way than one saves.
2. Taking a sample of 10000 and seeing if I could get a model even to converge (see Austin's post).
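The second strategy might be sketched as follows. This is only an illustration: it assumes the data are still in wide form with one observation per person, and it uses a toy dataset with a made-up variable person_id in place of the real file. The seed value is arbitrary.

```stata
* Toy stand-in for the real wide-form data: one observation per person.
* person_id is a hypothetical name, not taken from the original dataset.
clear
set obs 1000000
generate long person_id = _n

* Fix the seed so that the same sample can be drawn again later.
set seed 12345

* Keep a random sample of exactly 10000 observations (here, persons).
sample 10000, count

* Confirm the sample size before reshaping and fitting the model.
count
```

Sampling before reshaping keeps all 12 alternatives for each sampled person together once the data go to long form.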
Nick
Supnithadnaporn, Anupit
Martin and Nick, thank you so much. Your suggestion works very well.
I am sorry for being unclear about my question.
I will clarify it now. I have a large dataset of around 1 million records. There are
around 20 variables. I would like to run a clogit regression, which requires me
to reshape the data into long form. Basically, it is a model of a person
choosing a product from a choice set of 12. Thus, there would be 12 million
records in total after reshaping.
In the beginning, I tried a smaller sample, and reshape ran very slowly.
Then I thought I should start with only 2 ID variables: person ID and choice ID (1-12).
I created wide-form data with only the 2 ID variables, reshaped it to long form, and
lastly merged in the other variables that are associated with person and choice respectively.
However, even with only the 2 ID variables, reshape took a long time to finish
on a small sample of the 1 million records. That is why I am trying to find a
faster way to create the empty dataset with the 2 ID variables first. My next step
is then to merge in the information about the person and the choice.
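
One possible way to build the skeleton of 2 ID variables without -reshape- is -expand-. The sketch below is a minimal illustration under stated assumptions: it uses 3 toy persons in place of the 1 million, and the variable names person_id and choice_id and the file names in the comments are hypothetical.

```stata
* Toy stand-in: 3 persons instead of 1 million.
* person_id and choice_id are hypothetical variable names.
clear
set obs 3
generate long person_id = _n

* Replace each person with 12 copies, one per alternative in the choice set.
expand 12

* Number the copies 1-12 within each person.
bysort person_id: generate byte choice_id = _n

* Check that the two IDs jointly identify each observation.
isid person_id choice_id

* Next step (hypothetical file names, current -merge- syntax):
*   merge m:1 person_id using persons.dta
*   drop _merge
*   merge m:1 choice_id using choices.dta
```

With 1 million persons the same commands would give 12 million observations, as described above.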