Re: st: St: Dropping variables with mostly missing values
From: Jeph Herrin <[email protected]>
To: [email protected]
Subject: Re: st: St: Dropping variables with mostly missing values
Date: Fri, 07 Feb 2014 15:40:57 -0500
To drop all variables missing more than 80% of the time:
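foreach V of varlist _all {
    * count the nonmissing values of this variable
    quietly count if !mi(`V')
    * drop the variable itself (not observations) if under 20% are nonmissing
    if r(N)/_N < 0.2 {
        drop `V'
    }
}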
This works for string and numeric variables. Change 0.2 to whatever
level you want.
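The question below asks for a cutoff of 20 cases rather than a percentage. A minimal sketch of that variant, using the same loop, might look like the following. The toy data and the local macro name mincases are made up for illustration; on a real data set, skip the setup lines and run only the loop.

```stata
* Hypothetical toy data: x is complete, s has 50 nonmissing values,
* y has only 10 nonmissing values
clear
set obs 161
generate x = _n
generate y = _n if _n <= 10
generate str1 s = "a" if _n <= 50

* Minimum number of nonmissing cases a variable needs to be kept
local mincases 20

foreach V of varlist _all {
    * mi() handles both numeric and string variables
    quietly count if !mi(`V')
    * compare the raw nonmissing count to the cutoff
    if r(N) < `mincases' {
        drop `V'
    }
}

* y (10 cases) is dropped; x and s remain
describe
```

The varlist is expanded once when the loop starts, so dropping variables inside the loop does not disturb the iteration.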
hth,
Jeph
On 2/7/2014 3:11 PM, Eric M. Uslaner wrote:
I know that this has been discussed before, but a long search doesn't find a solution for me (my own fault in searching, most likely).
I have a data set (not my own) with 161 cases over a long time period. But most of the variables are largely made up of missing values (information wasn't available a long time ago). I have used Nick Cox's dropmiss (from SSC) to drop variables with all missing values. But a large number of variables remain with few observations. I would like to delete any variable with fewer than 20 cases. But I can't figure out how to do this (especially since I have a large number of variables, most of which have very few cases). Any help would be appreciated.
Ric Uslaner
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/