Re: st: processing time
At 12:55 PM 3/22/2007, Jon Schwabish wrote:
> Which is more efficient (in terms of processing time)?
>
> drop if a==.
> drop if b==.
>
> OR
>
> drop if a==. | b==.
I would think that the latter is more efficient, especially with
large datasets. You incur the cost of parsing and executing a command
once rather than twice (the expression is more complex, but I
don't suppose that matters much). Furthermore, the latter may be
noticeably more efficient if there are many cases with b==. that do
not have a==. . The reason is that when you drop observations, Stata
presumably moves records to close up the holes. With the
two-command method, some records will be moved twice rather than once.
I suppose it makes little difference for small datasets.
You can also -set rmsg on-, so that Stata reports the time each
command takes, and run some experiments.
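
Such an experiment might look like the sketch below. The dataset is
simulated purely for illustration: the number of observations, the 10%
missing rate, and the seed are arbitrary assumptions, and runiform() and
rnormal() assume a reasonably recent Stata. Only the variable names a
and b come from the question.

```stata
* Simulated data; sizes and missing rates are arbitrary assumptions.
clear
set seed 12345
set obs 1000000
generate a = cond(runiform() < 0.1, ., rnormal())
generate b = cond(runiform() < 0.1, ., rnormal())
preserve

* Method 1: two separate commands (a time is reported after each).
set rmsg on
drop if a==.
drop if b==.
set rmsg off

* Bring back the full dataset and keep the preserved copy.
restore, preserve

* Method 2: one command with a compound condition.
set rmsg on
drop if a==. | b==.
set rmsg off

restore
```

Add the two times reported for method 1 and compare the sum with the
single time reported for method 2. Timings vary from run to run, so it
is worth repeating the comparison a few times.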
Finally, be aware that a==. is not the general way to test for
missing values; it tests for equality with one specific missing
value, the system missing value (.), and does not catch the extended
missing values .a through .z. The way to test for missing values in
general is mi(a) or a>=. . The mi(a) method is more general still in
that it works for string variables as well.
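
To make the difference concrete, here is a small sketch using a
made-up four-observation dataset (the values are hypothetical and chosen
only to include extended missing values):

```stata
* Toy data: one nonmissing value, the system missing value,
* and two extended missing values.
clear
input a
1
.
.a
.z
end

* Matches only the system missing value: counts 1.
count if a==.

* Matches ., .a, and .z: counts 3.
count if a>=.

* missing() is the full name of mi(); also counts 3.
count if missing(a)
```

missing() also accepts several arguments and is true if any of them is
missing, so the single-command drop from the question could be written
as -drop if missing(a, b)-.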
HTH
--David