Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Miguel Angel Duran Munoz" <maduran@uma.es> |
To | statalist@hsphsun2.harvard.edu |
Subject | RE: st: Count observations |
Date | Tue, 23 Jul 2013 20:10:11 +0200 |
Thank you very much for your help. Actually, I was interested in the # of firms that have either A or B. I have used your suggestion:

******* start example
tab id if fee=="A" | fee=="B"
di r(r)
******* end example

Although I got a message stating that there are too many observations, your suggestion helped me discover -inspect- and -codebook-. -inspect- does not work either, because there are too many unique observations, but -codebook- works. In case this is helpful for anyone, this is what I have done:

codebook id if fee=="A" | fee=="B"

Miguel.

> I am now unclear what you want to count, so I will try a few things.
>
> The following counts the number of times either A or B appears in your
> variable _over all ids_ (I am following your statement that "A" and "B"
> are the "same" or "equivalent")
>
> ******* start example
> count if fee=="A" | fee=="B"
> ******* end example
>
> This does it _within each id_ (you had worked it out yourself)
>
> ******* start example
> bysort id : count if fee=="A" | fee=="B"
> ******* end example
>
> There are other things you can do. For instance, # of firms with either A
> or B, # of firms with both A and B, et cetera. In your second email you
> appear to be interested in the # of firms that have either A or B. This
> can be done by:
>
> ******* start example
> tab id if fee=="A" | fee=="B"
> di r(r)
> ******* end example
>
> while this counts the number of firms that have both "A" and "B" (but this
> crucially assumes that both "A" and "B", if they appear, cannot appear
> more than once.
> If either "A" or "B" can appear more than once by id, it
> does not work)
>
> ******* start example
> gen touse = (fee=="A" | fee=="B")
> bysort id : egen total = total(touse)
> tab id if total==2
> di r(r)
> ******* end example
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Miguel Angel
> Duran Munoz
> Sent: Tuesday, July 23, 2013 12:07 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: Count observations
>
> Thank you very much for your help. Let me explain a bit more why -count-
> did not work. There is something in my variables that I did not make
> explicit in my first message (I thought I could solve it on my own after
> being helped, but that is not the case).
>
> As I told you, the variable fee describes the type of fee (eg, A B C).
> Nevertheless, the dataset is constructed in a way that A and B, for
> instance, are the same (specifically, I have "commitment fee" and
> "commitment regular fee", but both types are the same). But, although A
> and B are the same, they both might be included for the same firm.
>
> Therefore, given this illustrative dataset,
>
> Id   Type-of-fee
>
> 1    A
> 1    B
> 1    C
> 2    C
> 2    A
> 3    A
> 4    B
> 4    .
> 4    A
>
> there are 4 firms that have either A or B. I was trying to use
> -bysort id: count if fee=="A" | fee=="B"-, but what I get is (obviously)
> split by firms.
>
> I am sorry for the initial confusion.
>
> Miguel.
>
>> Unclear why it does not work.
>> It works with the following:
>>
>> ******* start example
>> clear all
>> input id
>> 1
>> 1
>> 1
>> 2
>> 2
>> 3
>> 4
>> 4
>> 4
>> end
>> input str2 fee
>> A
>> B
>> C
>> C
>> A
>> A
>> B
>> ""
>> A
>> end
>> count if fee=="A"
>> ******* end example
>>
>> Notice that another alternative is -tab fee-
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Miguel
>> Angel Duran Munoz
>> Sent: Tuesday, July 23, 2013 10:51 AM
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Count observations
>>
>> Hi, Statalisters. I have the following question. My dataset is arranged
>> in the following way. I have a variable that identifies firms (say id).
>> Another variable describes whether different types of fees (eg, A B C)
>> apply to a firm. Accordingly, the dataset looks similar to
>>
>> Id   Type-of-fee
>>
>> 1    A
>> 1    B
>> 1    C
>> 2    C
>> 2    A
>> 3    A
>> 4    B
>> 4    .
>> 4    A
>>
>> I would like to know, for instance, the number of A fees that there
>> are. I have used -count- but I am not able to get what I want. Will
>> you please help me?
>>
>> Thanks in advance.
>>
>> Miguel.
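
Editorial sketch, not part of the original thread: the two firm-level counts discussed above (firms with either A or B, and firms with both) might also be obtained without -tabulate-, which is where the "too many" message came from. The code below is a minimal sketch. It assumes the variables are named id and fee as in the thread's examples and that fee is a string variable; ab_tag, has_a, has_b and firm_tag are hypothetical helper names. It relies on -egen-'s tag() and max() functions rather than on anything the posters themselves used.

```stata
* Sketch only: id and fee follow the thread's examples;
* ab_tag, has_a, has_b and firm_tag are hypothetical helper names.
clear
input id str2 fee
1 "A"
1 "B"
1 "C"
2 "C"
2 "A"
3 "A"
4 "B"
4 ""
4 "A"
end

* Firms with either A or B: tag one qualifying observation per id, then count.
egen byte ab_tag = tag(id) if inlist(fee, "A", "B")
count if ab_tag

* Firms with both A and B, even when a fee type repeats within an id.
bysort id: egen byte has_a = max(fee == "A")
bysort id: egen byte has_b = max(fee == "B")
egen byte firm_tag = tag(id)
count if firm_tag & has_a & has_b
```

On the thread's illustrative data the first -count- should report 4, matching the figure Miguel states, and the second should report 2 (ids 1 and 4). Because has_a and has_b are per-id indicators rather than running totals, the second count does not depend on A or B appearing at most once within an id.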
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/