RE: st: generate
From
Nick Cox <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
RE: st: generate
Date
Thu, 7 Oct 2010 12:17:45 +0100
It's the same data, wide or long. Which limit, observations or variables, do you imagine will bite first? Look at -help limits- for your version of Stata (not stated here).
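As a rough sketch of how to put numbers on that question: the counts below (30000 households, 2000 g variables) are taken from the quoted message further down, and the commands assume the wide dataset is already in memory. The actual limits depend on the flavor and version of Stata, so they are left to -help limits- rather than stated here.

```stata
* Long layout: one observation per hid-by-g combination
display %12.0fc 30000 * 2000

* Wide layout: 30000 observations; see how many variables are in use now
describe, short

* Compare both counts with the limits for your own Stata
help limits
```

The long layout would hold 60 million observations of a few variables; the wide layout holds 30,000 observations of about 2,000 variables.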
Before you replied, I was going to reinforce Dimitriy's advice. I would reach for -reshape- in this instance and I would keep the data in long form, at least on the information you have given.
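A minimal sketch of what staying in long form could look like, assuming the variables are named hid and g1-g100 as in the quoted messages below, that the g variables are numeric, and that nothing else in the dataset has a name starting with g. The name t is illustrative only.

```stata
* Keep only the identifier and the g variables
keep hid g*

* One observation per hid and original g index
reshape long g, i(hid) j(which_g)

* Discard the zeros, then number the remaining values in their original order
drop if g == 0
bysort hid (which_g): generate t = _n

* Retain the first 20 nonzero values per hid and stay long
keep if t <= 20
```

In this layout the first 20 nonzero values are the observations with t from 1 to 20, so no second -reshape- is needed unless a later command requires wide data.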
In a concurrent thread, I have commented:
Some things are easier with a wide structure but most things are easier otherwise.
There is much more discussion in
SJ-9-1 pr0046 . . . . . . . . . . . . . . . . . . . Speaking Stata: Rowwise
(help rowsort, rowranks if installed) . . . . . . . . . . . N. J. Cox
Q1/09 SJ 9(1):137--157
shows how to exploit functions, egen functions, and Mata
for working rowwise; rowsort and rowranks are introduced
Although that column shows that you can do many things rowwise, the underlying theme is that it isn't usually trivial.
Nick
[email protected]
Mirriam Gee
Thank you very much, Dimitriy, for your suggestion. It worked perfectly
well, but my main worry is that I have many hid (30000) and many g
variables (eventually I will work with over 2000 variables), so I will
end up with memory limitation problems if I use the reshape command,
unless of course I also divide my dataset into smaller groups.
On Wed, Oct 6, 2010 at 10:55 PM, Dimitriy V. Masterov wrote:
> Mirriam Gee wants to:
>> generate new variable(s) X1-X20 which contain the first 20
>> numbers (excluding the zeros) from g1-g100. For example:
>
> There's probably a more elegant way of doing this, but this can be
> accomplished with the -reshape- command to make your data easier to
> work with, and then reshaping it again to get it like you want it for
> your analysis. First, preserve the data and then reshape long to get
> the X variable. Then, reshape wide and save the X variables. Restore
> the G variables data, and merge the Xs back in with the Gs:
>
> #delimit;
> /* Preserve your data */
> preserve;
>
> /* Create the x variables with 2 reshapes */
> keep hid g*;
> reshape long g, i(hid) j(which_g);
>
> drop if g==0;
> rename g x;
> bys hid (which_g): gen t=_n;
> drop which_g;
>
> reshape wide x, i(hid) j(t);
>
> tempfile temp;
> save "`temp'";
>
> /* Restore data */
> restore;
>
> /* Merge the x variables with the g variables */
> merge 1:1 hid using "`temp'";
> drop x21-_merge;
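If memory rules out -reshape-, a rowwise loop is one possible alternative. The following is only a sketch under stated assumptions, not the method proposed in this thread: it assumes numeric variables g1-g100, uses a hypothetical counter variable named filled, and skips missing values as well as zeros (the -reshape- code above drops only zeros). It uses Stata's default line delimiter, not the semicolons of the quoted code.

```stata
* Create empty targets x1-x20 and a per-row counter of slots already filled
forvalues k = 1/20 {
    generate double x`k' = .
}
generate byte filled = 0

* Walk across g1-g100 in order; each nonzero value goes into the next free slot
forvalues j = 1/100 {
    forvalues k = 1/20 {
        quietly replace x`k' = g`j' if g`j' != 0 & !missing(g`j') & filled == `k' - 1
    }
    quietly replace filled = filled + 1 if g`j' != 0 & !missing(g`j') & filled < 20
}

drop filled
```

The data stay wide throughout, so the number of observations never grows, but the nested loop issues one -replace- per pair of g and x variables and may be slow with 2000 or more g variables.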