Dear Stata users,
I have a question related to data management. I have a fairly large
data set in the following format:
id   asc1   asc2   asc3   ...
-----------------------------
1    1      0      0
1    0      1      0
-----------------------------
2    0      1      0
2    1      0      0
2    0      0      1
-----------------------------
:
:
where id is the identifier and asc1, asc2, etc. are 0/1 indicators of
associations related to specific ids.
Eventually I would like to put it in this format, with one row per id:
id   asc1   asc2   asc3   ...
-----------------------------
1    1      1      0
2    1      1      1
:
:
My plan was to use reshape wide, for which I first needed the data to look like this:
id   asc1   asc2   asc3   ...
-----------------------------
1    1      1      0
1    1      1      0
-----------------------------
2    1      1      1
2    1      1      1
2    1      1      1
-----------------------------
:
:
That is, if a particular id is ever associated with a given asc, that
column is 1 for all occurrences of that id.
This could probably be done with
bysort id : g byte assc1 = sum(asc1)
or
collapse (sum) asc1-asc2138 , by (id)
My problem is that there are 2138 asc variables (i.e. the last variable
is asc2138) and not enough memory for collapse (see below), so I want to
automate this. I tried a loop like:
egen same = group(id)
forvalues i =1/_N{
    local j = 1
    while `j'=same{
        g ascc`j' =1
        continue
        local j = `j'+1
    }
}
But this doesn't work: I get an invalid syntax error (using Stata 10).
Any pointers, either for fixing this loop or for the original problem,
would be greatly appreciated.
Thanks,
Nikhil
query mem

Current memory allocation

                    current                                 memory usage
    settable          value     description                 (1M = 1024k)
    --------------------------------------------------------------------
    set maxvar         5000     max. variables allowed           1.909M
    set memory         745M     max. data space                745.000M
    set matsize         400     max. RHS vars in models          1.254M
                                                            -----------
                                                               748.163M
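
For reference, the per-id aggregation described above can be sketched as a loop over the variables. This is a minimal, untested sketch, not a definitive solution. It assumes that asc1 through asc2138 sit next to each other in the dataset, that they are coded 0/1, and that id identifies the groups. The temporary variable name anyasc_tmp is hypothetical. It takes the within-id maximum rather than the sum, so the result stays 0/1 even if an association occurs more than once for an id.

```stata
* Sketch only: assumes asc1-asc2138 are adjacent 0/1 variables and that
* id identifies the groups. anyasc_tmp is a hypothetical scratch name.
foreach v of varlist asc1-asc2138 {
    * within-id maximum: 1 if the id ever has this association, else 0
    bysort id: egen byte anyasc_tmp = max(`v')
    quietly replace `v' = anyasc_tmp
    drop anyasc_tmp
}
* all rows within an id are now identical, so keep one row per id
bysort id: keep if _n == 1
```

Because the loop handles one variable at a time, it adds only a single extra byte variable at any point, which may be easier on memory than collapse. The final line goes straight to one row per id, so under these assumptions the intermediate layout for reshape would not be needed.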