I do not feel confused, but I did not grasp that
that was what you wanted. I can't see a simpler
way than this. For the benefit of anyone watching,
the -egen- function -nvals()- comes from -egenmore-
on SSC. A footnote gives code using official Stata
only.
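A sketch for anyone wanting to reproduce the log: the six observations
from the quoted message could be typed in with -input- first. Variable
names follow that message; storage types are left at Stata's defaults,
which is an assumption about the original dataset.

```stata
* enter the example data from the quoted message
clear
input Gvkey psic ssic year subno
1223 4767 4743 1999 1
1223 4767 4763 1999 2
1223 4757 4767 1999 3
1223 4767 4753 1999 4
1223 4777 4787 1999 5
1223 4767 4743 1999 6
end
list
```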
. rename psic Kpsic
. rename ssic Kssic
. reshape long K , string i(Gvkey year subno)
(note: j = psic ssic)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        6   ->      12
Number of variables                   5   ->       5
j variable (2 values)                     ->   _j
xij variables:
                            Kpsic Kssic   ->   K
-----------------------------------------------------------------------------
. l

     +------------------------------------+
     | Gvkey   year   subno     _j      K |
     |------------------------------------|
  1. |  1223   1999       1   psic   4767 |
  2. |  1223   1999       1   ssic   4743 |
  3. |  1223   1999       2   psic   4767 |
  4. |  1223   1999       2   ssic   4763 |
  5. |  1223   1999       3   psic   4757 |
     |------------------------------------|
  6. |  1223   1999       3   ssic   4767 |
  7. |  1223   1999       4   psic   4767 |
  8. |  1223   1999       4   ssic   4753 |
  9. |  1223   1999       5   psic   4777 |
 10. |  1223   1999       5   ssic   4787 |
     |------------------------------------|
 11. |  1223   1999       6   psic   4767 |
 12. |  1223   1999       6   ssic   4743 |
     +------------------------------------+

. egen nvals = nvals(K), by(Gvkey year)
. l

     +--------------------------------------------+
     | Gvkey   year   subno     _j      K   nvals |
     |--------------------------------------------|
  1. |  1223   1999       1   psic   4767       7 |
  2. |  1223   1999       1   ssic   4743       7 |
  3. |  1223   1999       2   psic   4767       7 |
  4. |  1223   1999       2   ssic   4763       7 |
  5. |  1223   1999       3   psic   4757       7 |
     |--------------------------------------------|
  6. |  1223   1999       3   ssic   4767       7 |
  7. |  1223   1999       4   psic   4767       7 |
  8. |  1223   1999       4   ssic   4753       7 |
  9. |  1223   1999       5   psic   4777       7 |
 10. |  1223   1999       5   ssic   4787       7 |
     |--------------------------------------------|
 11. |  1223   1999       6   psic   4767       7 |
 12. |  1223   1999       6   ssic   4743       7 |
     +--------------------------------------------+

. reshape wide
(note: j = psic ssic)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                       12   ->       6
Number of variables                   6   ->       6
j variable (2 values)                _j   ->   (dropped)
xij variables:
                                      K   ->   Kpsic Kssic
-----------------------------------------------------------------------------
. renpfix K
. l

     +--------------------------------------------+
     | Gvkey   year   subno   psic   ssic   nvals |
     |--------------------------------------------|
  1. |  1223   1999       1   4767   4743       7 |
  2. |  1223   1999       2   4767   4763       7 |
  3. |  1223   1999       3   4757   4767       7 |
  4. |  1223   1999       4   4767   4753       7 |
  5. |  1223   1999       5   4777   4787       7 |
     |--------------------------------------------|
  6. |  1223   1999       6   4767   4743       7 |
     +--------------------------------------------+

Perhaps we should add this example to the webpage
cited, by Gary Longton and myself.
Nick
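* Footnote: the same count using official Stata only
rename psic Kpsic
rename ssic Kssic
reshape long K , string i(Gvkey year subno)
l
* flag the first occurrence of each distinct value of K within Gvkey year
bysort Gvkey year K : gen nvals = _n == 1
* running count of flags, then copy the group total to every observation
by Gvkey year : replace nvals = sum(nvals)
by Gvkey year : replace nvals = nvals[_N]
sort Gvkey year subno
l
reshape wide
renpfix K
l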
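
A more compact variant of the official-Stata code, offered as a sketch
rather than as the method used above: after the -reshape long-,
-egen, tag()- marks one observation for each distinct combination of
Gvkey, year and K, and -egen, total()- adds up those marks within Gvkey
and year. This assumes a Stata version in which -egen- has -total()-
(older versions call it -sum()-) and that K has no missing values; the
variable name tag is only illustrative. On the example data it should
reproduce nvals = 7.

```stata
rename psic Kpsic
rename ssic Kssic
reshape long K , string i(Gvkey year subno)
* mark exactly one observation per distinct K within Gvkey year
egen tag = tag(Gvkey year K)
* the number of marks within Gvkey year is the number of distinct values
egen nvals = total(tag), by(Gvkey year)
* tag varies within i(), so drop it before going back to wide
drop tag
reshape wide
renpfix K
list
```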
Wanli Zhao
> Thanks, Nick. I looked into the suggestions and I think I might have
> confused you on my problem. My panel data is like this:
> Gvkey   psic   ssic   year   subno
>  1223   4767   4743   1999       1
>  1223   4767   4763   1999       2
>  1223   4757   4767   1999       3
>  1223   4767   4753   1999       4
>  1223   4777   4787   1999       5
>  1223   4767   4743   1999       6
>
> Using command unique, I can count the distinct values of psic and ssic
> by gvkey by year. So for psic it's 3 and for ssic it's 5. What I want
> is to count the distinct values of both psic and ssic by gvkey by year.
> In this case, it's 7 (4767, 4757, 4777, 4743, 4763, 4753, 4787). How to
> generate a new variable for my purpose? Hope I'm clear now. Pls help.
Nick Cox
> By "unique" here I think you mean "distinct".
>
> Try -groups- from SSC. Or -egen, group()- and then tabulate.
Wanli Zhao
> > I have a simple question but got stuck on a simple solution. I have a
> > panel and let's say cross-section id is gvkey and time id is year.
> > There are two variables, say, primary sic and secondary sic. My aim is
> > to count the unique values of sic in both variables by gvkey by year.
> > I know the 'by' thing is straightforward but is there a quick solution
> > to count the unique observations in both variables? I know the commands
> > such as unique, distinct and egenmore nvals. They work perfectly for a
> > single variable. Also, on the webpage there is an explanation of the
> > unique combination of two variables and how to count that. I guess
> > mine is different. Your help is appreciated.