Dear Eva and Nick
Thanks for the quick response. Sorry for the unclear email; I was trying to keep it short.
Since sending my first email, I have made some progress but I still welcome help from you or anyone
else.
By way of a brief background, I am running this on a number of data sets with the aim of listing the
suspect cases in a way that allows us to easily go back to the questionnaires to check that the
listed value is indeed a valid entry. I would therefore want to be able to see the actual values of
fup_pdt and fup_unit.
The full code I am working with is:
use qtr_b_fup, clear
levelsof qtr, local(qtrs)
foreach qtr of local qtrs {
    * reload the full data and keep one quarter at a time
    use qtr_b_fup, clear
    keep if qtr == `qtr'
    levelsof fup_pdt, local(fupdts)
    foreach i of local fupdts {
        * only the units that actually occur for this product
        levelsof fup_unit if fup_pdt == `i', local(fupunits)
        foreach j of local fupunits {
            preserve
            keep if fup_pdt == `i' & fup_unit == `j'
            * -olindicator- creates fup_qtycoll_ol (1 = suspected outlier)
            olindicator fup_qtycoll
            di "Potential errors in the quantity collected for `i' & unit `j'"
            list houscode qtr fup_pdt fup_unit fup_qtycoll if fup_qtycoll_ol==1
            restore
        }
    }
}
where -olindicator- is a small program I have written to help me identify the outliers.
The output looks like this:
----------------------------------------------------------------------------------------
Potential errors in the quantity collected for 1 & unit 34
+---------------------------------------------------+
| houscode | qtr | fup_pdt | fup_unit | fup_qtycoll |
|----------+-----+---------+----------+-------------|
|      138 |   1 |       1 |       34 |          50 |
+---------------------------------------------------+
----------------------------------------------------------------------------------------
I think this can be improved so that I don't have to keep reloading the data, so I welcome any help.
Hope this makes things clearer.
Ronnie
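On the point about not reloading the data: one possible approach is to flag suspect values in a
single pass with by-groups, so that no loop, -preserve-, or second -use- is needed. The sketch
below is only an illustration. Since the thread does not show what -olindicator- does, it assumes
a 1.5 x IQR fence as the outlier rule; the names p25, p75, fence_lo, and fence_hi are made up for
the example, and the toy data are the ten rows from the original question.

```stata
* Toy data: the ten rows shown in the original question (quarter 1 only)
clear
input qtr houscode fup_pdt fup_unit fup_qtycoll
1 562  23   2  50
1 570 628   2   2
1 573 628 201  10
1 573 628   2   2
1 576 628 201   5
1 576 628 201  20
1 577 628   2   1
1 578 628   2   1
1 590  34  26  60
1 595  34  26 200
end

* Quartiles within each quarter x product x unit cell, computed for all cells at once
bysort qtr fup_pdt fup_unit: egen double p25 = pctile(fup_qtycoll), p(25)
bysort qtr fup_pdt fup_unit: egen double p75 = pctile(fup_qtycoll), p(75)

* ASSUMPTION: 1.5 x IQR fences stand in for whatever -olindicator- actually does
gen double fence_lo = p25 - 1.5*(p75 - p25)
gen double fence_hi = p75 + 1.5*(p75 - p25)

* Same flag name as in the loop version; missing quantities are never flagged
gen byte fup_qtycoll_ol = !missing(fup_qtycoll) & (fup_qtycoll < fence_lo | fup_qtycoll > fence_hi)

* One listing for all cells, with a separator line between cells
sort qtr fup_pdt fup_unit houscode
list houscode qtr fup_pdt fup_unit fup_qtycoll if fup_qtycoll_ol == 1, sepby(qtr fup_pdt fup_unit)
```

Because fup_pdt and fup_unit are listed next to each flagged value, the listing can still be
traced back to the questionnaires. With only a handful of rows per cell, as in the toy data, a
quartile-based rule has little to work with, so it is better tried on the full data set.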
Eva Poen wrote:
Ronnie,
this is difficult to answer without knowing what you mean by "run my
checks". There are some tools out there to detect outliers; use
-findit outliers- to see what's around.
Whether or not you need nested loops will depend on whether or not
your checks need to know the actual value of fup_pdt and fup_unit. The
-preserve- and -keep- thing will slow things down, and you may be able
to do without it (by just using if conditions in your code).
But, really, we need some more information to be able to help.
Eva
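The suggestion above, using if conditions rather than -preserve- and -keep-, could look roughly
like the following sketch. It is not Eva's code. The variables cell and suspect are hypothetical
names, the 1.5 x IQR fence is an assumed stand-in for the unspecified checks, and the toy data are
five rows taken from the original question. A single -egen group()- id also replaces the nested
loops over quarter, product, and unit.

```stata
* Toy data: five rows from the original question
clear
input qtr houscode fup_pdt fup_unit fup_qtycoll
1 573 628 201  10
1 576 628 201   5
1 576 628 201  20
1 590  34  26  60
1 595  34  26 200
end

* One id per quarter x product x unit cell, so a single loop is enough
egen long cell = group(qtr fup_pdt fup_unit)
summarize cell, meanonly
local ncells = r(max)

gen byte suspect = 0
forvalues g = 1/`ncells' {
    * restrict each command with -if- instead of -preserve- / -keep- / -restore-
    quietly summarize fup_qtycoll if cell == `g', detail
    * ASSUMPTION: 1.5 x IQR fences stand in for the real checks
    local lo = r(p25) - 1.5*(r(p75) - r(p25))
    local hi = r(p75) + 1.5*(r(p75) - r(p25))
    quietly replace suspect = 1 if cell == `g' & !missing(fup_qtycoll) & (fup_qtycoll < `lo' | fup_qtycoll > `hi')
}

* The data are never dropped, so everything can be listed at the end
list qtr houscode fup_pdt fup_unit fup_qtycoll if suspect == 1
```

Whether this is possible with a user-written checking program depends on whether that program
accepts an if qualifier. If it does not, the loop body would have to fall back on -preserve-.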
2009/4/13 Ronnie Babigumira <[email protected]>:
Dear list
I have quarterly data that looks like this
qtr  houscode  fup_pdt  fup_unit  fup_qtycoll
  1       562       23         2           50
  1       570      628         2            2
  1       573      628       201           10
  1       573      628         2            2
  1       576      628       201            5
  1       576      628       201           20
  1       577      628         2            1
  1       578      628         2            1
  1       590       34        26           60
  1       595       34        26          200
For each quarter, I would like to identify "strange" values (outliers) in the variable
fup_qtycoll (simply to rule out data entry error). This would be done separately for each
combination of fup_pdt and fup_unit.
My initial idea is that I would need three nested -foreach- loops, something along these lines
(this does not work, since I would need to -preserve- before -keep-ing, which I would like to
avoid, and it is not by quarter yet):
levelsof fup_pdt, local(fupdts)
foreach i of local fupdts {
    keep if fup_pdt == `i'
    levelsof fup_unit, local(fupunits)
    foreach j of local fupunits {
        keep if fup_unit == `j'
        *** run my checks and other stuff
    }
}
I would appreciate some help on how I can do this efficiently
Ronnie
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/