It is difficult to advise how to do something unspecified in an
efficient way.
Nor do I see why you think you need to throw out most of the data at
each step.
But if the idea is to check each cross-combination of variables, then
-egen group = group(<varlist>)- creates a single grouping variable.
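As a rough sketch of that idea, and no more than that: what counts as "strange" was not specified, so the 3 standard deviation cutoff below is purely an assumption, as are the new variable names grp, mean_q, sd_q and flag. Every check is done within quarter and within product-unit combination using -by:-, so nothing is dropped.

```stata
* Sketch only: the 3 SD cutoff and the names grp, mean_q, sd_q, flag are assumptions
* one grouping variable for each fup_pdt x fup_unit combination
egen grp = group(fup_pdt fup_unit)

* mean and SD of the quantity within each quarter and combination
bysort qtr grp: egen mean_q = mean(fup_qtycoll)
by qtr grp: egen sd_q = sd(fup_qtycoll)

* flag values far from their group mean; groups with one observation
* have missing SD and are left unflagged (flag is missing)
gen byte flag = abs(fup_qtycoll - mean_q) > 3 * sd_q if !missing(sd_q, fup_qtycoll)

list qtr houscode fup_pdt fup_unit fup_qtycoll if flag == 1
```

Other definitions of "strange" (say, based on medians and quartiles) would fit the same pattern by changing the -egen- functions and the comparison.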
Nick
Ronnie Babigumira wrote:
I have quarterly data that look like this:
qtr  houscode  fup_pdt  fup_unit  fup_qtycoll
  1       562       23         2           50
  1       570      628         2            2
  1       573      628       201           10
  1       573      628         2            2
  1       576      628       201            5
  1       576      628       201           20
  1       577      628         2            1
  1       578      628         2            1
  1       590       34        26           60
  1       595       34        26          200
For each quarter, I would like to identify "strange" values (outliers)
in the variable fup_qtycoll (simply to rule out data entry errors).
This would be done separately for each combination of fup_pdt and
fup_unit.
My initial idea is that I would have to do it in three nested -foreach-
loops, something along the lines of the following. (This does not work,
since I would need to -preserve- before -keep-ing, which I would like to
avoid, and it is not done by quarter yet.)
levelsof fup_pdt, local(fupdts)
foreach i of local fupdts {
    keep if fup_pdt == `i'
    levelsof fup_unit, local(fupunits)
    foreach j of local fupunits {
        keep if fup_unit == `j'
        *** run my checks and other stuff
    }
}
I would appreciate some help on how I can do this efficiently.
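
If a loop really is wanted, one possible restructuring avoids -keep- and -preserve- altogether by restricting each check with an -if- qualifier. This is only a sketch: the variable grp and the local ngroups are hypothetical names, and -summarize, detail- merely stands in for the unspecified checks.

```stata
* Sketch only: grp, ngroups and the -summarize, detail- call are assumptions
* one group for each quarter x product x unit combination, labelled by its values
egen grp = group(qtr fup_pdt fup_unit), label

* groups are numbered 1 to r(max)
summarize grp, meanonly
local ngroups = r(max)

forvalues g = 1/`ngroups' {
    * -if- restricts the check to one group; no observations are dropped
    display as text _newline "group `g': `: label (grp) `g''"
    summarize fup_qtycoll if grp == `g', detail
}
```

A single -forvalues- loop over grp replaces the three nested -foreach- loops, and the full dataset stays in memory throughout.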