Maarten posted some example code to help the original poster.
I made a couple of additions to Maarten's code to aid my
understanding, but I just became more confused.
I have not changed any of his lines except the final -list- (and I
commented out the -drop-). I have added four lines that generate the
variables agef, agec, fl and ce, and I don't understand the values
these variables take.
The modified code:
*---------- begin example (modified by timbp) --------------
clear
input ///
age sales ind year id
2 1.04 3339 1991 1
3 1.75 3339 1991 2
3 3.08 3339 1991 3
31 .496 3339 1991 4
42 .546 3339 1991 5
42 1.5 3339 1991 6
5 . 3411 1991 7
8 .584 3411 1991 8
30 .491 3411 1991 9
19 .944 3411 1991 10
20 .692 3411 1991 11
28 1.81 3411 1991 12
29 .601 3411 1991 13
32 .509 3411 1991 14
42 .938 3411 1991 15
42 .886 3411 1991 16
end
gen byte miss = missing(age, sales, ind, year)
bysort miss ind year (sales): gen long n = ///
_N - 1 if miss == 0
bysort miss ind year (sales): gen medage = ///
(age[`= floor(n/2)'] + age[`= ceil(n/2)'])/2 ///
if miss == 0
bysort miss ind year (sales): gen agef = ///
age[`= floor(n/2)'] if miss == 0 // added by timbp
bysort miss ind year (sales): gen agec = ///
age[`= ceil(n/2)'] if miss == 0 // added by timbp
bysort miss ind year (sales): gen fl = ///
floor(n/2) if miss == 0 // added by timbp
bysort miss ind year (sales): gen ce = ///
ceil(n/2) if miss == 0 // added by timbp
* drop miss n
list, sepby(miss ind) //modified by timbp
*--------------- end example ----------------------
The output:
     +----------------------------------------------------------------------------+
     | age   sales    ind   year   id   miss   n   medage   agef   agec   fl   ce |
     |----------------------------------------------------------------------------|
  1. |  31    .496   3339   1991    4      0   5       22     42      2    2    3 |
  2. |  42    .546   3339   1991    5      0   5       22     42      2    2    3 |
  3. |   2    1.04   3339   1991    1      0   5       22     42      2    2    3 |
  4. |  42     1.5   3339   1991    6      0   5       22     42      2    2    3 |
  5. |   3    1.75   3339   1991    2      0   5       22     42      2    2    3 |
  6. |   3    3.08   3339   1991    3      0   5       22     42      2    2    3 |
     |----------------------------------------------------------------------------|
  7. |  30    .491   3411   1991    9      0   8       20     32      8    4    4 |
  8. |  32    .509   3411   1991   14      0   8       20     32      8    4    4 |
  9. |   8    .584   3411   1991    8      0   8       20     32      8    4    4 |
 10. |  29    .601   3411   1991   13      0   8       20     32      8    4    4 |
 11. |  20    .692   3411   1991   11      0   8       20     32      8    4    4 |
 12. |  42    .886   3411   1991   16      0   8       20     32      8    4    4 |
 13. |  42    .938   3411   1991   15      0   8       20     32      8    4    4 |
 14. |  19    .944   3411   1991   10      0   8       20     32      8    4    4 |
 15. |  28    1.81   3411   1991   12      0   8       20     32      8    4    4 |
     |----------------------------------------------------------------------------|
 16. |   5       .   3411   1991    7      1   .        .      .      .    .    . |
     +----------------------------------------------------------------------------+
My questions:
1. For ind==3411, fl and ce are both 4, so why are agef and agec different?
2. For ind==3411, medage appears to be the average of age[2] and age[3]
(subscripts counted within the by-group).
How does Stata get those subscripts when fl==4 and ce==4?
3. For ind==3339, fl==2 and ce==3, and medage appears to be the average of
age[2] and age[3]; but for ind==3411, fl==4 and ce==4, and medage still
appears to be the average of age[2] and age[3]. Why do the subscripts
match fl and ce in one group but not in the other?
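A check that may help with these questions. It is only a sketch, meant to be run right after the modified example above, and it rests on the assumption that the `= exp' macro expressions are expanded once, before -gen- runs, using n from the first observation of the sorted data rather than separately within each by-group. The names agef2, agec2, agef3 and agec3 are made up for illustration.

```stata
*---------- begin sketch (not part of Maarten's code) --------------
* assumes the modified example above has just been run, so the data
* are still sorted by miss ind year sales and miss, n, fl, ce exist

* what the inline expressions expand to; n with no subscript is n[1]
display n[1]
display `= floor(n/2)'
display `= ceil(n/2)'

* if the assumption holds, agef and agec are simply age[2] and age[3]
* within each by-group, whatever fl and ce are in that group
bysort miss ind year (sales): gen agef2 = age[2] if miss == 0
bysort miss ind year (sales): gen agec2 = age[3] if miss == 0
assert agef2 == agef & agec2 == agec

* for comparison: plain expressions as subscripts (no macro expansion)
* are evaluated for each observation, so they follow fl and ce
bysort miss ind year (sales): gen agef3 = age[floor(n/2)] if miss == 0
bysort miss ind year (sales): gen agec3 = age[ceil(n/2)] if miss == 0
list ind n fl ce agef agec agef3 agec3, sepby(miss ind)
*--------------- end sketch ----------------------
```

With the data above, the first observation after sorting belongs to ind==3339, where n is 5, so the three -display- lines should show 5, 2 and 3. That would account for age[2] and age[3] being used in both groups. With the plain subscripts, agef3 and agec3 should agree for ind==3411, since both pick age[4] there.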
Thanks,
tim
Maarten Buis wrote:
--- On Tue, 18/8/09, John Hund wrote:
As an example, I have data (sales and ages) on firms by
year in different industries. I would like to find the
age of the firm with the median value on sales for each
year and industry.
*---------- begin example --------------
clear
input ///
age sales ind year id
2 1.04 3339 1991 1
3 1.75 3339 1991 2
3 3.08 3339 1991 3
31 .496 3339 1991 4
42 .546 3339 1991 5
42 1.5 3339 1991 6
5 . 3411 1991 7
8 .584 3411 1991 8
30 .491 3411 1991 9
19 .944 3411 1991 10
20 .692 3411 1991 11
28 1.81 3411 1991 12
29 .601 3411 1991 13
32 .509 3411 1991 14
42 .938 3411 1991 15
42 .886 3411 1991 16
end
gen byte miss = missing(age, sales, ind, year)
bysort miss ind year (sales): gen long n = ///
_N - 1 if miss == 0
bysort miss ind year (sales): gen medage = ///
(age[`= floor(n/2)'] + age[`= ceil(n/2)'])/2 ///
if miss == 0
drop miss n
list
*--------------- end example ----------------------
Hope this helps,
Maarten
-----------------------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://home.fsw.vu.nl/m.buis/
-----------------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/