Is your variable stored as a string or as a number? The following example
assumes a string variable. Observations with a missing value (an empty
string) are left out of the percentages and get a missing newvar.
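* number of non-missing observations (the denominator)
count if var != ""
local n = r(N)
* start at 0 for every non-missing observation
gen newvar = 0 if var != ""
levelsof var, local(levels)
foreach l of local levels {
    * r(N) now holds the number of observations in this category
    count if var == "`l'"
    * >= so that a category at exactly 10% (C in your example) gets 1;
    * use > instead if you mean strictly more than 10%
    replace newvar = 1 if var == "`l'" & r(N)/`n' >= 0.1
}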
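
If you would rather avoid the loop, the same flag can be sketched with
by-group counts. Treat this as a sketch under assumptions, not a tested
solution for your data: newvar2 is a made-up name so that it does not clash
with newvar, missing() is used so that the same lines work whether var is
string or numeric, and I use >= because C in your example sits at exactly
10% (use > for strictly more than 10%). Note that bysort re-sorts the data
by var.

```stata
* sketch only: newvar2 is a hypothetical name
count if !missing(var)
local n = r(N)
* within each category, _N is the number of observations in that category;
* the comparison evaluates to 1 (true) or 0 (false)
bysort var: gen byte newvar2 = (_N/`n' >= 0.1) if !missing(var)
```

A quick check is "tabulate var newvar2": each category should fall entirely
in either the 0 column or the 1 column.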
Friedrich
On Wed, Sep 24, 2008 at 9:14 PM, <[email protected]> wrote:
> I have a categorical variable with 30 levels. How do I create a variable
> that is equal to 1 if a category of the variable shows up more than 10% of
> the time?
>
> For example:
> var   Percent
> A     5
> B     5
> C     10
> D     20
> E     60
> How would I create "newvar" equal to 1 for C, D, and E and equal to 0 for A
> and B?