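A problem with Dimitriy's loop is that it mishandles string
variables: -summarize- returns r(N) = 0 for a string variable,
so every string variable would be dropped regardless of how
many nonmissing values it holds. I would tune it to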
    foreach v of var * {
        qui count if !missing(`v')
        if r(N) < 100 drop `v'
    }
where 100 is of course a place-holder for your own
desired constant.
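
To make the placeholder explicit, here is a minimal sketch of the same loop with the constant held in a local macro. The macro name minobs, the threshold of 70, and the use of the shipped auto data are my assumptions for illustration only; substitute your own data and cutoff.

```stata
* Sketch only: minobs and the auto data are illustrative assumptions
sysuse auto, clear
local minobs 70

foreach v of varlist * {
    * count handles numeric and string variables alike
    quietly count if !missing(`v')
    if r(N) < `minobs' {
        display as text "dropping `v' (" r(N) " nonmissing)"
        drop `v'
    }
}
```

In the auto data, rep78 has 69 nonmissing values and every other variable has 74, so with this cutoff only rep78 should go. The string variable make is counted like any other.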
-count- remains an under-appreciated command.
Nick
Dimitriy V. Masterov
> There might be a more clever way of doing this, but here's my
> solution:
>
> /* This defines a local named variables that contains a list
> with all variables */
> unab variables: _all
>
> /* This loop drops all variables that have fewer than 100 obs. */
> foreach var in `variables' {
>     qui sum `var'
>     if r(N)<100 {
>         drop `var'
>     }
> }
Eric Uslaner
> > I know of Nick Cox's great dropmiss program. I want to do something
> > akin to it (without having to drop each variable individually). Say
> > that a data set has N cases and I want to drop variables that have
> > fewer than n nonmissing cases. Theoretically I could generate new
> > variables through count, but my data set is already close to the
> > maximum allowed without upgrading to SE (which is why I want to drop
> > some variables). Is there a way to do this:
> >
> > drop if _N < n
> >
> > or something similar?