Re: st: Re: Saving 1 observation
An imperfect solution might be to use -outsheet-, which
allows the -if- qualifier. First, save your labels:
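label save using mylabels, replace

* Replace N with the number of portions.
forv i=1/N {
    * comma: write a real CSV; nolabel: write numeric codes, not label text
    outsheet using Portion`i'.csv if Needed`i', comma nolabel replace
}

clear
forv i=1/N {
    insheet using Portion`i'.csv, comma clear
    * -label save- stores only -label define- commands; reattach value
    * labels with -label values- (variable labels are not restored)
    do mylabels
    save Portion`i', replace
}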
The first loop saves the fragmentary datasets as CSV files;
the second reads them in, applies the labels, and saves them
as Stata files.
It is clumsy, but I think it will be much faster than your
current solution.
hth,
Jeph
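One gap in the CSV round trip above: -label save- writes only -label define- commands, so after -do mylabels- the value labels exist but are not attached to any variable, and variable labels are lost. A minimal sketch of one way to carry both across, assuming the auto dataset as a stand-in for the real data, a hypothetical helper file attachlabels.do, and that -insheet- imports variable names in lowercase (its default behavior):

```stata
* Toy stand-in for the large dataset; Needed1 is built only for illustration.
sysuse auto, clear
generate byte Needed1 = foreign == 1

* Value label definitions go to mylabels.do.
label save using mylabels, replace

* Write a second do-file (hypothetical name) that reattaches labels by name.
tempname fh
file open `fh' using "attachlabels.do", write text replace
foreach v of varlist _all {
    local name = lower("`v'")          // assumes -insheet- lowercases names
    local varlab : variable label `v'
    local vallab : value label `v'
    if `"`varlab'"' != "" {
        file write `fh' `"label variable `name' `"`varlab'"'"' _n
    }
    if "`vallab'" != "" {
        file write `fh' `"label values `name' `vallab'"' _n
    }
}
file close `fh'

* Round trip for one portion: numeric codes out, labels reattached on the way in.
outsheet using Portion1.csv if Needed1, comma nolabel replace
insheet using Portion1.csv, comma clear
do mylabels
do attachlabels
save Portion1, replace
```

The helper file is written once, before the loops, and reused for every portion. Variable labels that themselves contain compound quotes would need extra care; this sketch does not handle them.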
Sergiy Radyakin wrote:
Dear Michael,
thank you very much for your suggestions. Just as you wrote, the first
one saves about half the time needed and is a good improvement. The
second one is a bit complicated, since I don't immediately see how the
labels can be declared/saved with this approach.
So, I am thinking about saving the labels first with -label save-,
then dumping the data into several files with -post-, then open them
one-by-one and apply the saved labels and resave. Would that be the
fastest way to do it?
Thank you,
Sergiy Radyakin
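The plan described in this message (save the labels, post the observation, reopen and relabel) might look roughly like the following. This is only a sketch: the auto dataset stands in for the real data, and the observation number K, the file name PortionK, and the choice of three variables are assumptions for illustration. Note that -post- writes a dataset with no labels at all, and that -label save- records label definitions but not which variable each label belongs to, so the -label values- step has to be spelled out.

```stata
sysuse auto, clear
local K = 5                            // hypothetical observation to extract

* Value label definitions go to mylabels.do.
label save using mylabels, replace

* Post observation K into a new one-observation dataset.
tempname h
postfile `h' str18 make double price byte foreign using "PortionK", replace
post `h' (make[`K']) (price[`K']) (foreign[`K'])
postclose `h'

* Reopen the small file, redefine the labels, reattach, and resave.
use "PortionK", clear
do mylabels
label values foreign origin
save "PortionK", replace
```

With many variables the -postfile- variable list becomes long, and variable labels would still have to be restored separately with -label variable-.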
On 5/28/08, Michael Blasnik wrote:
...
I have two suggestions that may be worth exploring:
1) use -restore, preserve- instead of -restore- and you will save the time
required to preserve the dataset next time.
2) a little more tricky, but you could employ -post- to post an observation
to a dataset. I'm not sure how much time this would save but it may be
worth a try.
Michael Blasnik
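Suggestion 1 can be sketched as a loop in which the dataset is preserved once and then reloaded after each portion is saved. The auto dataset and the way Needed1 and Needed2 are built here are assumptions, used only so the example runs on its own.

```stata
* Toy stand-in for the large dataset.
sysuse auto, clear
generate byte Needed1 = foreign == 1
generate byte Needed2 = foreign == 0

preserve                               // the full dataset is set aside once
forvalues i = 1/2 {
    keep if Needed`i'
    save "Portion`i'", replace
    restore, preserve                  // reload the copy and keep it preserved
}
restore, not                           // full data is in memory; drop the copy
```

Each pass still reloads the full dataset, but the cost of writing the preserved copy is paid only once.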
----- Original Message -----
From: "Sergiy Radyakin"
Sent: Wednesday, May 28, 2008 6:31 PM
Subject: st: Saving 1 observation
Hello All!
I have a large dataset (about 1 million observations, 600 MB).
I need to (repeatedly) save several small portions of it (a portion can be
as small as 1 observation) into separate files.
So far it is done along these lines:
preserve
keep if Needed1
save "Portion1"
restore
preserve
keep if Needed2
save "Portion2"
restore
... etc ...
where variables Needed1 and Needed2 are dummies generated earlier in the
code.
This works. But it is painfully slow.
The problem is that each pass has to preserve and restore the
whole large dataset.
-save- does not support -if- and -in- modifiers, otherwise my ideal
choice would be:
save "Portion1" if Needed1
save "Portion2" if Needed2
As an alternative I was thinking of writing the file directly (by
generating the Stata file byte by byte), but since I need labels to be
preserved together with the data, this becomes trickier, and
reinventing what is already [well] done does not sound like a good
idea.
To pose a specific question: how can I save one observation K,
1<=K<=_N, (with labels) to a Stata file, without having to save the
whole dataset?
Version of Stata: Stata 10/ Windows
Thank you,
Sergiy Radyakin
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/