On 12/9/06, Lucy Shum <[email protected]> wrote:
> Hi, Thanks for the help. Could somebody expand a little on the egen ... =
> cut(age), group(4) command? I'm not sure how to interpret this in "English".
-man egen- explains it in English.
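As a rough sketch of what that command does (the variable name agegrp and the toy ages below are made up for illustration, not taken from your data): with group(4), cut() splits age into four groups of roughly equal frequency and codes them 0 to 3.

```stata
* Toy data: 80 observations with distinct ages 20..99 (illustrative only)
clear
set obs 80
generate age = 19 + _n

* agegrp is a hypothetical name; group(4) asks for 4 equal-frequency groups
egen agegrp = cut(age), group(4)

* See how many observations and which age range fall in each group
tabulate agegrp
tabstat age, by(agegrp) statistics(min max n)
```

The -tabstat- line is only there so you can see which ages ended up in which group.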
> Further, I am also stumped about the first line where it says:
> .. keep if _n==_N
> So it's saying to keep the current patient observation (2 entries per
> patient in the Catheter.dta file) as long as it is the same as the total
> number of observations in the dataset? That doesn't make sense to me.
That wouldn't make sense at all, since you would end up with only one
observation.
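A quick way to check this, assuming the auto dataset that ships with Stata (not your Catheter.dta file): without a -by- prefix, _N is the number of observations in the whole dataset, so only the very last row survives.

```stata
* auto.dta ships with Stata and has 74 observations
sysuse auto, clear
display _N

* No by prefix: _n == _N is true only for the final observation
keep if _n == _N
display _N
```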
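However, in this example
. bysort patient (time): keep if _n == _N
the -keep- is prefixed by -bysort-, so it is applied separately to each
group formed by the variable patient. Within a by-group, _n is the
observation's position in the group and _N is the number of observations
in the group. If there are five observations on patient one and four on
patient two, then only the observation from the last time each was seen
is retained by the -keep- command. The (time) in the above statement
sorts the observations within each patient group in ascending order of
time, so the last observation in each group is the latest one.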
To demonstrate how this works, consider the data set below:
. list
     +----------------+
     | patient   time |
     |----------------|
  1. |       1      3 |
  2. |       1      4 |
  3. |       1      1 |
  4. |       1      2 |
  5. |       1      5 |
     |----------------|
  6. |       2      4 |
  7. |       2      2 |
  8. |       2      3 |
  9. |       2      1 |
 10. |       3      6 |
     |----------------|
 11. |       3      1 |
 12. |       3      5 |
 13. |       3      2 |
 14. |       3      4 |
 15. |       3      3 |
     |----------------|
 16. |       3      7 |
     +----------------+
* Sorting the data does this...
. sort patient time
. list
     +----------------+
     | patient   time |
     |----------------|
  1. |       1      1 |
  2. |       1      2 |
  3. |       1      3 |
  4. |       1      4 |
  5. |       1      5 |
     |----------------|
  6. |       2      1 |
  7. |       2      2 |
  8. |       2      3 |
  9. |       2      4 |
 10. |       3      1 |
     |----------------|
 11. |       3      2 |
 12. |       3      3 |
 13. |       3      4 |
 14. |       3      5 |
 15. |       3      6 |
     |----------------|
 16. |       3      7 |
     +----------------+
* Generate an indicator of each observation's position within its
* patient group, ordered by time
. bysort patient (time) : gen _ = _n
. list
     +--------------------+
     | patient   time   _ |
     |--------------------|
  1. |       1      1   1 |
  2. |       1      2   2 |
  3. |       1      3   3 |
  4. |       1      4   4 |
  5. |       1      5   5 |
     |--------------------|
  6. |       2      1   1 |
  7. |       2      2   2 |
  8. |       2      3   3 |
  9. |       2      4   4 |
 10. |       3      1   1 |
     |--------------------|
 11. |       3      2   2 |
 12. |       3      3   3 |
 13. |       3      4   4 |
 14. |       3      5   5 |
 15. |       3      6   6 |
     |--------------------|
 16. |       3      7   7 |
     +--------------------+
* Generate an indicator of how many observations there are on each patient
. bysort patient (time) : gen __ = _N
. list
     +-------------------------+
     | patient   time   _   __ |
     |-------------------------|
  1. |       1      1   1    5 |
  2. |       1      2   2    5 |
  3. |       1      3   3    5 |
  4. |       1      4   4    5 |
  5. |       1      5   5    5 |
     |-------------------------|
  6. |       2      1   1    4 |
  7. |       2      2   2    4 |
  8. |       2      3   3    4 |
  9. |       2      4   4    4 |
 10. |       3      1   1    7 |
     |-------------------------|
 11. |       3      2   2    7 |
 12. |       3      3   3    7 |
 13. |       3      4   4    7 |
 14. |       3      5   5    7 |
 15. |       3      6   6    7 |
     |-------------------------|
 16. |       3      7   7    7 |
     +-------------------------+
* Retain the last observation on each patient based on time (which
* will be where the value of _ is equal to __)
. keep if _ == __
(13 observations deleted)
. list
     +-------------------------+
     | patient   time   _   __ |
     |-------------------------|
  1. |       1      5   5    5 |
  2. |       2      4   4    4 |
  3. |       3      7   7    7 |
     +-------------------------+
Stata is just doing all of this on the fly for you by using the
original command.
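In case it helps to run this yourself, here is a sketch that types in the same sixteen observations with -input- and applies the original one-line command. It should leave the same three rows as the step-by-step version, just without the helper variables _ and __.

```stata
* Recreate the example data (deliberately unsorted)
clear
input patient time
1 3
1 4
1 1
1 2
1 5
2 4
2 2
2 3
2 1
3 6
3 1
3 5
3 2
3 4
3 3
3 7
end

* One step: sort by patient and time, keep the last row of each patient
bysort patient (time): keep if _n == _N
list, noobs
```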
HTH's
Neil
--
"Doing science for the money is like having sex for the exercise." - Matt
Email - [email protected] / [email protected]
Website - http://slack.ser.man.ac.uk/
Photos - http://www.flickr.com/photos/slackline/