Dear Michael,
thank you very much for your suggestions. Just as you wrote, the first
one saves about half the time needed and is a good improvement. The
second one is a bit complicated, since I don't immediately see how the
labels can be declared/saved with this approach.
So, I am thinking about saving the labels first with -label save-,
then dumping the data into several files with -post-, then opening them
one by one, applying the saved labels, and resaving. Would that be the
fastest way to do it?
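
To make the plan concrete, here is a rough sketch of what I have in mind.
It is untested, and the variable names (x, y), the label name (xlbl), the
observation number K and the file names are made up for illustration. Note
that -label save- writes only the value label definitions: which label is
attached to which variable, and the variable labels themselves, would have
to be reapplied separately.

```stata
* Sketch only: x, y, xlbl, K and the file names are hypothetical.

* Step 1: write all value label definitions to a do-file.
label save using "labels.do", replace

* Step 2: post observation K to a new file; the data in memory stay untouched.
* Storage types are declared explicitly, since -postfile- defaults to float.
local K = 1
tempname ph
postfile `ph' double x double y using "Portion1", replace
post `ph' (x[`K']) (y[`K'])
postclose `ph'

* Step 3: later, once the large dataset is no longer needed in memory,
* reopen the small file, redefine the labels, reattach them, and resave.
use "Portion1", clear
do "labels.do"
label values x xlbl
save "Portion1", replace
```

String variables would need their type (e.g. str20) declared in -postfile-
as well, and Step 3 would be repeated for every portion file.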
Thank you,
Sergiy Radyakin
On 5/28/08, Michael Blasnik <[email protected]> wrote:
> ...
>
> I have two suggestions that may be worth exploring:
>
> 1) use -restore, preserve- instead of -restore- and you will save the time
> required to preserve the dataset next time.
>
> 2) a little more tricky, but you could employ -post- to post an observation
> to a dataset. I'm not sure how much time this would save but it may be
> worth a try.
>
> Michael Blasnik
>
>
> ----- Original Message ----- From: "Sergiy Radyakin"
> <[email protected]>
> To: <[email protected]>
> Sent: Wednesday, May 28, 2008 6:31 PM
> Subject: st: Saving 1 observation
>
>
> > Hello All!
> >
> > I have a large dataset (to be specific ~ 1mln observations, 600MB).
> >
> > I need to (repeatedly) save several small portions of it (small can be
> > as small as 1 observation) into separate files.
> >
> > So far it is done similarly to this
> >
> > preserve
> > keep if Needed1
> > save "Portion1"
> > restore
> >
> > preserve
> > keep if Needed2
> > save "Portion2"
> > restore
> >
> > ... etc ...
> >
> > where variables Needed1 and Needed2 are dummies generated earlier in the
> > code.
> >
> > This works. But it is painfully slow.
> >
> > The problem is that it will necessarily have to preserve/restore the
> > whole large dataset.
> > -save- does not support -if- and -in- modifiers, otherwise my ideal
> > choice would be:
> >
> > save "Portion1" if Needed1
> > save "Portion2" if Needed2
> >
> > As an alternative I was thinking of saving the dataset directly (by
> > generating Stata file byte-by-byte), but since I need labels to be
> > preserved together with the data, this becomes more tricky, and
> > reinventing what is already [well] done, does not sound like a good
> > idea.
> >
> > To pose a specific question: how to save one observation 1<=K<=_N
> > (with labels) to a Stata file, without having to save the whole
> > dataset?
> >
> > Version of Stata: Stata 10/ Windows
> >
> > Thank you,
> > Sergiy Radyakin
> >
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>