RE: st: RE: RE: Use foreach or forvalues to create the long form data

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: RE: Use foreach or forvalues to create the long form data
Date   Thu, 16 Oct 2008 16:50:09 +0100

Your problem is much clearer, but I don't have much comfort for you. 

Your best chance of getting things done much faster is with a different hardware set-up. Naturally I have no idea what is possible for you. 

If this were my problem, I would be thinking about two quite different strategies: 

1. Letting -reshape- chug along and doing something else while I wait. Sometimes one can waste more time finding a better way than one saves. 

2. Taking a sample of 10000 and seeing if I could get a model even to converge (see Austin's post). 
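
A minimal sketch of strategy 2, assuming the wide data are already in memory. The seed value is arbitrary and is only there so that the same sample can be drawn again; nothing here is specific to the original dataset.

```stata
* Sketch only: draw a random sample of 10,000 observations
* before attempting -reshape- and -clogit- on the full data.
* The seed is an arbitrary choice, used only for reproducibility.
set seed 12345
* With the count option, -sample- keeps that many observations
* rather than that percentage of the data.
sample 10000, count
```

If -reshape- and the model behave on the sample, the same commands can then be tried on progressively larger samples.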

[email protected] 

Supnithadnaporn, Anupit

Martin and Nick, thank you so much. Your suggestion works very well.

I am sorry for being unclear about my question.
I will clarify it now. I have a large dataset of around 1 million records. There are
around 20 variables. I would like to run the clogit regression, which requires me
to reshape the data into the long form. Basically, it is a model of a person
choosing a product from a choice set of 12. Thus, the total would be
12 million records after reshaping.

In the beginning, I tried with a smaller sample and reshape worked very slowly.
Then, I thought I should start with only 2 ID variables: person ID and choice ID (1-12).
I created the wide form data with only the 2 ID variables, reshaped it to the long form, and
lastly merged the other variables that are associated with person and choice respectively.

However, even with only the 2 ID variables, it took a long time for reshape to finish
for a small sample of the total 1 million records. That is why I am trying to find a
faster way to create the empty dataset with the 2 ID variables first. Then my next step
is to merge in the information about the person and the choice.
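
One way such a two-ID skeleton could be built without -reshape- is sketched below, assuming a dataset with one observation per person is in memory. The variable names person_id and choice_id are hypothetical and do not come from the original data.

```stata
* Sketch only: person_id and choice_id are hypothetical names.
* Keep just the person identifier and check that it is unique.
keep person_id
isid person_id
* Make 12 copies of each person, one per alternative in the choice set.
expand 12
* Number the copies 1 to 12 within each person.
bysort person_id: generate byte choice_id = _n
```

The person-level and choice-level variables would then be merged onto this skeleton; the exact -merge- syntax depends on the Stata version in use.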

