The frequencies of -type- can be put
in -typ_freq- by
bysort type : gen typ_freq = _N
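If you want to scale to relative frequencies
(which sum to 1 over the distinct values of -type-)
then you can
su typ_freq, meanonly
replace typ_freq = typ_freq / r(N)
Here r(N) is the number of observations. Each frequency
is repeated on every observation in its group, so the
divisor is the number of observations, not r(sum),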
and percents naturally can be obtained
with a factor of 100.
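A minimal sketch of the scaling, using the ten values of -type- from the question quoted below. The names -typ_relfreq-, -typ_pct- and -first- are hypothetical, chosen only for this illustration. The divisor is the number of observations, because each frequency is repeated on every observation in its group.

```stata
* ten observations of type, as in the question's listing
clear
input type
1
1
2
2
2
3
4
4
5
6
end

* frequency of each value of type, repeated within its group
bysort type : gen typ_freq = _N

* relative frequency: group count over the total number of observations
* (outside by:, _N is the number of observations in the data set)
gen typ_relfreq = typ_freq / _N

* percent
gen typ_pct = 100 * typ_relfreq

* check: one observation per distinct value; the relative
* frequencies should sum to 1 up to rounding
egen first = tag(type)
su typ_relfreq if first, meanonly
display r(sum)

list type typ_freq typ_relfreq typ_pct, sepby(type)
```

Here -type- 1 occurs twice in ten observations, so its relative frequency is 0.2 and its percent is 20.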
Some care will often be needed over
missing values:
bysort type : gen typ_freq = _N if !missing(type)
su typ_freq, meanonly
replace typ_freq = typ_freq / r(N)
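A small sketch of why the missing values matter, on made-up data with two missing values of -type-. The name -typ_relfreq- is hypothetical. The divisor here comes from -count-, which leaves the number of nonmissing observations in r(N), so missing values are excluded from both numerator and denominator.

```stata
* made-up data: three nonmissing values and two missing
clear
input type
1
1
2
.
.
end

* frequencies for nonmissing values only; missing otherwise
bysort type : gen typ_freq = _N if !missing(type)

* number of observations with nonmissing type
count if !missing(type)

* relative frequency among nonmissing observations
gen typ_relfreq = typ_freq / r(N)

list
```

Without the -if- qualifier the two missing values would be counted as a group of their own, and dividing by _N would put them in the denominator as well.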
Another way to get the frequencies
(-egen-'s sum() was renamed total() in Stata 9):
egen typ_freq = total(1) if !missing(type), by(type)
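A quick check that the two routes agree, sketched on made-up data. The names -freq_by- and -freq_egen- are hypothetical, and total() assumes Stata 9 or later.

```stata
* made-up data including one missing value
clear
input type
1
1
2
2
2
3
.
end

* route 1: by: with _N
bysort type : gen freq_by = _N if !missing(type)

* route 2: egen, summing the constant 1 within groups
egen freq_egen = total(1) if !missing(type), by(type)

* both are missing where type is missing and equal elsewhere
assert freq_by == freq_egen
```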
Nick
Matteo Foschi wrote:
>
> We are having a little trouble with what we think is an easy task.
> We want to generate a variable that contains the relative
> frequency of the values of another variable.
> We have a variable, say "type", and want to build a new variable,
> say "typ_freq", which shows the relative frequency of
> each value of "type".
>
> We first tried the tablepc ado-file:
> tablepc type, generate (typ_freq)
> However, that gives only the relative frequency of each
> observation.
>
> Alternatively, we can obtain the cumulative frequency
> (variable freqcum)
> with a little program:
>
> generate freqcum = .
> ..
> sort typ_freq
> by typ_freq: gen groups = 1 if _n ==1
> replace groups = sum(groups)
> ..
> local K = groups[_N]
> local i 1
> while `i' <= `K' {
> replace freqcum = sum(typ_freq)
> local i = `i' + 1
> }
> ..
> We are not able, however, to recode freq_cum or typ_freq
> in order to obtain
> the relative frequency of each value of "type", as shown below:
>
> Obs  Type  obs_freq  freq_cum  typ_freq
>   1     1         1         1         2
>   2     1         1         2         2
>   3     2         1         3         3
>   4     2         1         4         3
>   5     2         1         5         3
>   6     3         1         6         1
>   7     4         1         7         2
>   8     4         1         8         2
>   9     5         1         9         1
>  10     6         1        10         1
>