RE: st: Count observations

From   "Miguel Angel Duran Munoz"
Subject   RE: st: Count observations
Date   Tue, 23 Jul 2013 20:10:11 +0200

Thank you very much for your help.

Actually I was interested in the # of firms that have either A or B. I
have used your suggestion:

******* start example
tab id if fee=="A" | fee=="B"
di r(r)
******* end example

Although I got a message stating that there are too many observations,
your suggestion led me to discover -inspect- and -codebook-. -inspect-
does not work either, because there are too many unique values, but
-codebook- works: it reports the number of unique values of id.

In case this is helpful for anyone, this is what I have done:

codebook id if fee=="A" | fee=="B"
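
When -tab- refuses because id has too many distinct values, another possible route (a sketch only, not something proposed in this thread) is to tag one observation per firm among the rows with A or B and count the tags. The variable name tagAB is made up for illustration, and the data are the nine illustrative rows quoted below.

```stata
clear
input id str2 fee
1 "A"
1 "B"
1 "C"
2 "C"
2 "A"
3 "A"
4 "B"
4 ""
4 "A"
end

* tag exactly one observation per id among rows where fee is A or B;
* rows excluded by the if condition get 0, not missing
egen byte tagAB = tag(id) if inlist(fee, "A", "B")

* the number of tagged rows is the number of firms with either A or B
count if tagAB
display r(N)
```

With the illustrative data this should display 4. Unlike -tab-, -count- leaves the result in r(N), so it can be reused in later commands.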


> I am now unclear about what you want to count, so I will try a few things.
> The following counts the number of times either A or B  appears in your
> variable _over all ids_ (I am following your statement that "A" and "B"
> are the "same" or "equivalent")
> ******* start example
> count if fee=="A" | fee=="B"
> ******* end example
> This does it _within each id_ (you had worked it out yourself)
> ******* start example
> bysort id : count if fee=="A" | fee=="B"
> ******* end example
> There are other things you can do. For instance, # of firms with either A
> or B, # of firms with both A and B, et cetera. In your second email you
> appear to be interested in the # of firms that have either A or B.  This
> can be done by:
> ******* start example
> tab id if fee=="A" | fee=="B"
> di r(r)
> ******* end example
> The following counts the number of firms that have both "A" and "B".
> It crucially assumes that "A" and "B" each appear at most once per id.
> If either "A" or "B" can appear more than once within an id, it does
> not work.
> ******* start example
> gen touse = (fee=="A" | fee=="B")
> bysort id : egen total = total(touse)
> tab id if total==2
> di r(r)
> ******* end example
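
If "A" or "B" can appear more than once within an id, one way around the caveat above might be to build a separate per-firm indicator for each fee type instead of summing a combined flag. This is a hedged sketch, not the method from the message: the names hasA, hasB and firm are hypothetical, and the last data row (a second "A" for firm 4) is added here only to illustrate a repeat.

```stata
clear
input id str2 fee
1 "A"
1 "B"
1 "C"
2 "C"
2 "A"
3 "A"
4 "B"
4 ""
4 "A"
4 "A"
end

* per-firm indicators: 1 if the firm has at least one row of that type
bysort id: egen byte hasA = max(fee == "A")
bysort id: egen byte hasB = max(fee == "B")

* tag one row per firm so that each firm is counted only once
egen byte firm = tag(id)

* firms with both A and B, however often each type is repeated
count if firm & hasA & hasB
display r(N)
```

Here firms 1 and 4 have both types, so this should display 2, whereas the total==2 approach would miss firm 4 because its combined flag sums to 3.
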
> -----Original Message-----
> On Behalf Of Miguel Angel Duran Munoz
> Sent: Tuesday, July 23, 2013 12:07 PM
> Subject: RE: st: Count observations
> Thank you very much for your help. Let me explain a bit more why -count-
> did not work. There is something in my variables that I did not make
> explicit in my first message (I thought I could solve it on my own after
> being helped, but that is not the case).
> As I told you, the variable fee describes the type of fee (eg, A B C).
> Nevertheless, the dataset is constructed in a way that A and B, for
> instance, are the same (specifically, I have "commitment fee" and
> "commitment regular fee", but both types are the same). But, although A
> and B are the same, they both might be included for the same firm.
> Therefore, given this illustrative dataset,
> Id    Type-of-fee
> 1          A
> 1          B
> 1          C
> 2          C
> 2          A
> 3          A
> 4          B
> 4          .
> 4          A
> there are 4 firms that have either A or B. I was trying to use this,
> -bysort id: count if fee=="A" | fee=="B"-, but what I get is (obviously)
> split by firm.
> I am sorry for the initial confusion.
> Miguel.
>> Unclear why it does not work. It works with the following:
>> ******* start example
>> clear all
>> input id
>> 1
>> 1
>> 1
>> 2
>> 2
>> 3
>> 4
>> 4
>> 4
>> end
>> input str2 fee
>> A
>> B
>> C
>> C
>> A
>> A
>> B
>> ""
>> A
>> end
>> count if fee=="A"
>> ******* end example
>> Notice that another alternative is -tab fee-
>> -----Original Message-----
>> On Behalf Of Miguel Angel Duran Munoz
>> Sent: Tuesday, July 23, 2013 10:51 AM
>> Subject: Re: st: Count observations
>> Hi, Statalisters. I have the following question. My dataset is arranged
>> in the following way. I have a variable that identifies firms (say id).
>> Another variable describes which types of fees (eg, A B C)
>> apply to a firm. Accordingly, the dataset looks similar to
>> Id    Type-of-fee
>> 1          A
>> 1          B
>> 1          C
>> 2          C
>> 2          A
>> 3          A
>> 4          B
>> 4          .
>> 4          A
>> I would like to know, for instance, the number of A fees that there
>> are. I have used -count- but I am not able to get what I want. Will
>> you please help me?
>> Thanks in advance.
>> Miguel.
