Following up on Friedrich's reply, it is easy enough to apply his idea to
non-string (numeric) variables:
**********
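program define tenpercent
    version 10.1
    args varname
    * number of nonmissing observations (extended missings .a-.z excluded too)
    count if !missing(`varname')
    local n = r(N)
    * flag starts at 0 and stays missing where the variable is missing
    gen new`varname' = 0 if !missing(`varname')
    levelsof `varname', local(levels)
    foreach l of local levels {
        * r(N) is the frequency of the current level
        count if `varname' == `l'
        replace new`varname' = 1 if `varname' == `l' & r(N)/`n' > 0.1
    }
end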
**********
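To try it out, something like the following should work, assuming the program
above has already been defined in the current session. The auto dataset and
rep78 are only my choice for illustration; the new variable is named by
prefixing "new" to the variable name, so it is newrep78 here.

```stata
* load a shipped example dataset (chosen here only for illustration)
sysuse auto, clear

* create newrep78: 1 for levels of rep78 covering more than 10% of the
* nonmissing observations, 0 for the other levels, missing otherwise
tenpercent rep78

* cross-check the flag against the raw frequencies
tabulate rep78 newrep78, missing
```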
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Friedrich Huebler
Sent: Thursday, September 25, 2008 4:09 AM
To: [email protected]
Subject: Re: st: Marking Levels of Categorical Variable
Is your variable in string or numeric format? The following example
assumes string format. Missing values are excluded from the analysis.
count if var!=""
local n = r(N)
gen newvar = 0 if var!=""
levelsof var, local(levels)
foreach l of local levels {
count if var=="`l'"
replace newvar = 1 if var=="`l'" & r(N)/`n' > 0.1
}
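The same result can probably be had without a loop. This is only a sketch
under the same assumptions (string variable named var, missing values
excluded); nonmiss, n_nonmiss, share and newvar2 are names made up for
illustration.

```stata
* mark nonmissing observations of the string variable
gen byte nonmiss = var != ""

* total number of nonmissing observations, repeated on every row
egen n_nonmiss = total(nonmiss)

* share of each level among nonmissing observations; _N is the group size
* (stored as double so that a level at exactly 10% is not flagged)
bysort var: gen double share = _N / n_nonmiss if nonmiss

* 1 if the level covers more than 10%, 0 otherwise, missing if var is missing
gen byte newvar2 = share > 0.1 if nonmiss
```

Note that bysort re-sorts the data by var, so the original row order is lost
unless it is saved beforehand.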
Friedrich
On Wed, Sep 24, 2008 at 9:14 PM, <[email protected]> wrote:
> I have a categorical variable with 30 levels. How do I create a variable
> that is equal to 1 if a category of the variable shows up more than 10% of
> the time?
>
> For example:
> var   Percent
> A         5
> B         5
> C        10
> D        20
> E        60
> How would I create "newvar" equal to 1 for C, D, and E and equal to 0 for
> A and B?
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/