Hello All!
I have a large dataset (about 1 million observations, 600 MB).
I need to repeatedly save several small portions of it (a portion can
be as small as a single observation) into separate files.
So far I do it along these lines:

    preserve
    keep if Needed1
    save "Portion1"
    restore

    preserve
    keep if Needed2
    save "Portion2"
    restore

    ... etc ...

where the variables Needed1 and Needed2 are dummies generated earlier in the code.
This works, but it is painfully slow.
The problem is that every preserve/restore pair has to save and reload
a copy of the whole large dataset.
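
One variation that might trim part of the cost, as a minimal sketch only: -preserve- once and use -restore, preserve- between portions, so the temporary copy of the full dataset is written a single time (it still has to be reloaded for every portion, so this does not solve the problem). The loop and the local macro i are illustrative and assume the dummies are literally named Needed1 and Needed2; -replace- is added on the assumption that the files are overwritten on repeated runs.

```stata
* Sketch only: assumes dummies named Needed1, Needed2 exist in memory.
preserve                          // write the full dataset to a temporary copy once
forvalues i = 1/2 {
    keep if Needed`i'             // reduce to the portion of interest
    save "Portion`i'", replace    // labels are stored with the saved file
    restore, preserve             // reload the full data, keep the temporary copy
}
restore, not                      // full data is already in memory; drop the copy
```
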
-save- does not support the -if- and -in- qualifiers; otherwise my ideal
choice would be (not valid syntax):

    save "Portion1" if Needed1
    save "Portion2" if Needed2

As an alternative I was thinking of writing the file directly (by
generating the Stata file byte by byte), but since I need the labels to
be preserved together with the data, this becomes trickier, and
reinventing what is already [well] done does not sound like a good
idea.
To pose a specific question: how can I save a single observation K,
1 <= K <= _N (with labels), to a Stata file without having to save the
whole dataset?
Stata version: Stata 10 / Windows
Thank you,
Sergiy Radyakin