" was not optimal it created a dataset of means,
and not counts at the school level (unless I was doing something
incorrect....)"
Just to maintain fairness w.r.t. -collapse-, it can produce all kinds of
statistics, as you can see from its help file, not just means. Glad my
solution worked out for you, though :-)
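
A minimal sketch of how -collapse- might be coaxed into producing counts rather than means, assuming one 0/1 indicator is first generated per response category. The shortened variable names (timep08, timef08) and the toy data are taken from the example posted earlier in this thread; the indicator names timep1-timep3 and timef1-timef3 are just illustrative.

```stata
* Sketch: school-level counts via indicator variables and -collapse (sum)-
clear
input byte schid int studid byte timep08 byte timef08
2 6910 2 2
2 6911 2 2
2 6912 2 3
2 6913 3 3
4 7299 2 2
4 7300 2 2
4 7301 3 1
4 7302 2 2
4 7303 2 2
4 7304 2 1
4 7305 1 .
end

* One 0/1 indicator per response category; a missing response yields 0
forvalues i = 1/3 {
    generate byte timep`i' = (timep08 == `i')
    generate byte timef`i' = (timef08 == `i')
}

* Summing the indicators within each school gives the category counts
* (timep? matches timep1-timep3 but not timep08)
collapse (sum) timep? timef?, by(schid)
list, noobs
```

Because nothing is reshaped wide, this should avoid creating thousands of intermediate variables on a large dataset.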
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of [email protected]
Sent: Monday, 15 February 2010 20:05
To: [email protected]
Subject: st: Re: Constructing a group level variable
Thanks to Nick and Martin for their replies. Suggestions I received
and their results are:
1) Use -collapse-: this was not optimal, as it created a dataset of means,
and not counts, at the school level (unless I was doing something
incorrectly....)
2) Use -contract-:
. contract timepub08 timefin08 schid
. sort schid
. l, sepby(schid)
     +-------------------------------------+
     | schid   timep~08   timef~08   _freq |
     |-------------------------------------|
  1. |     2          2          2      15 |
  2. |     2          .          .       8 |
  3. |     2          2          1       4 |
  4. |     2          2          3       1 |
  5. |     2          3          3       5 |
     |-------------------------------------|
  6. |     4          .          .       4 |
  7. |     4          1          3       1 |
  8. |     4          3          1       8 |
Again, this was not precisely what I was looking for.
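
For what it is worth, a sketch of how the -contract- route might be pushed one step further: contract on one question at a time, then -reshape wide- the frequencies so that each school occupies a single row. This is an assumption about what was wanted, not something tried on the full data; it uses the toy data and shortened names (timep08) from the example elsewhere in this thread, and would have to be repeated (and merged) for the second question.

```stata
* Sketch: -contract- one question at a time, then spread _freq across columns
clear
input byte schid int studid byte timep08 byte timef08
2 6910 2 2
2 6911 2 2
2 6912 2 3
2 6913 3 3
4 7299 2 2
4 7300 2 2
4 7301 3 1
4 7302 2 2
4 7303 2 2
4 7304 2 1
4 7305 1 .
end

* Frequencies per school and category; -zero- keeps empty cells as 0,
* -nomiss- leaves out students with a missing response
contract schid timep08, zero nomiss

* One row per school, one _freq column per response category
reshape wide _freq, i(schid) j(timep08)

* Rename _freq1-_freq3 to timep1-timep3 (loop works in older Stata versions too)
forvalues i = 1/3 {
    rename _freq`i' timep`i'
}
list, noobs
```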
3) using reshape:
reshape wide time*, i(schid) j(studid)
forv i=1/3{
egen byte timep`i' = anycount(timepub0*), values(`i')
egen byte timef`i' = anycount(timefin0*), values(`i')
}
drop timepub0* timefin0*
order schid timep* timef*
l, noo
Note: I had to make minor changes in the code (to correct the varnames).
This worked like a charm! Although repeating it on my large dataset
will take quite a bit of time :-(
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     3530   ->      48
Number of variables                   4   ->    7061
j variable (3530 values)         studid   ->   (dropped)
xij variables:
                              timepub08   ->   timepub083779 timepub083780 ... timepub087567
                              timefin08   ->   timefin083779 timefin083780 ... timefin087567
-----------------------------------------------------------------------------
.
. forv i=1/3{
2. egen byte timep`i' = anycount(timepub0*), values(`i')
3. egen byte timef`i' = anycount(timefin0*), values(`i')
4. }
.
. drop timepub0* timefin0*
. order schid timep* timef*
.
. l, noo
  +-------------------------------------------------------------+
  | schid   timep1   timep2   timep3   timef1   timef2   timef3 |
  |-------------------------------------------------------------|
  |     2        0       20        5        4       15        6 |
  |     4       19       70       17       37       60        3 |
  |     6       19       65        4       43       42        3 |
  |     7       10       60       20       35       46        7 |
  |     8       24       79        7       47       59        6 |
  |-------------------------------------------------------------|
  |    10       15       61       15       35       52        3 |
  |    11        4       38        1       10       30        2 |
  |    12        0       35        5       14       26        0 |
  |    16       30       26        2       38       16        2 |
  |    18        5       53       19       31       40        3 |
  |-------------------------------------------------------------|
  |    20       17       53        6       31       41        3 |
  |    27        2       35        8       16       26        2 |
  |    28        8       59       17       19       55        9 |
  |    32        4       42       14       27       27        5 |
  |    33        0       23       11        6       23        4 |
  |-------------------------------------------------------------|
  |    34        7       60        1       14       51        2 |
  |    36       20       33        2       29       22        3 |
  |    38        8       68       18       59       32        2 |
  |    40       10       18        1       20       10        0 |
  |    42       44       95       15       57       79       13 |
  |-------------------------------------------------------------|
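
On the run-time worry for a large dataset: a possible sketch that skips the wide reshape altogether by using -egen, total()- with by(), which attaches the school-level counts to every student row. This is offered as an untimed alternative, not a benchmarked one; the toy data below reuse the posted values with the full variable names, and the marker variable onerow is purely illustrative.

```stata
* Sketch: school-level counts as group-level variables, no reshape needed
clear
input byte schid int studid byte timepub08 byte timefin08
2 6910 2 2
2 6911 2 2
2 6912 2 3
2 6913 3 3
4 7299 2 2
4 7300 2 2
4 7301 3 1
4 7302 2 2
4 7303 2 2
4 7304 2 1
4 7305 1 .
end

* total() sums the true/false expression within each school;
* int rather than byte in case a count exceeds 100
forvalues i = 1/3 {
    egen int timep`i' = total(timepub08 == `i'), by(schid)
    egen int timef`i' = total(timefin08 == `i'), by(schid)
}

* Optional: keep one row per school to get the school-level table
egen byte onerow = tag(schid)
keep if onerow
keep schid timep? timef?
list, noobs
```

Dropping the last four commands leaves the student-level data intact with the counts repeated within school, which may be handy if the counts are wanted as covariates.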
Many thanks!
***************************************************************************
This is "so not elegant" :-(
*************
clear*
input byte schid studid byte timep08 byte timef08
2 6910 2 2
2 6911 2 2
2 6912 2 3
2 6913 3 3
4 7299 2 2
4 7300 2 2
4 7301 3 1
4 7302 2 2
4 7303 2 2
4 7304 2 1
4 7305 1 .
end
reshape wide time*, i(schid) j(studid)
forv i=1/3{
egen byte timep`i' = anycount(timep0*), values(`i')
egen byte timef`i' = anycount(timef0*), values(`i')
}
drop timep0* timef0*
order schid timep* timef*
l, noo
*************
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of
[email protected]
Sent: Wednesday, 10 February 2010 17:25
To: [email protected]
Cc: [email protected]
Subject: st: Constructing a group level variable
Hi all,
I have a dataset that consists of students (studid) in 49 schools
(schid) responding to a survey. They were asked their impressions of the
curriculum ("do you believe time devoted to subject xxx was ....") and
all responses were categorical (with 1 denoting 'not enough', 2 denoting
'just right', and 3 being 'too much'). A slice of the data is:
list schid studid timepub08 timefin08 in 30/40
     +--------------------------------------+
     | schid   studid   timep~08   timef~08 |
     |--------------------------------------|
 30. |     2     6910          2          2 |
 31. |     2     6911          2          2 |
 32. |     2     6912          2          3 |
 33. |     2     6913          3          3 |
 34. |     4     7299          2          2 |
     |--------------------------------------|
 35. |     4     7300          2          2 |
 36. |     4     7301          3          1 |
 37. |     4     7302          2          2 |
 38. |     4     7303          2          2 |
 39. |     4     7304          2          1 |
     |--------------------------------------|
 40. |     4     7305          1          . |
     +--------------------------------------+
Question: Is there a way to generate (or collapse this into) a dataset of
school-level variables? I am interested in school-level variables that
capture the number of responses in each category (1 'not enough', 2
'just right', and 3 'too much') for each question (timepub08, timefin08).
Many thanks for the advice.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/