Thanks for the further information.
Add support for -if- and -in- to your -olindicator- and then use the
-group()- idea mentioned earlier.
Nick
[email protected]
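
A minimal sketch of what this advice might look like in practice. The source of -olindicator- is not shown in this thread, so the program below is a hypothetical stand-in named -olindicator_sketch-. Its outlier rule (more than 1.5 interquartile ranges outside the quartiles) is an assumption, as is reading "the -group()- idea" as the group() function of -egen-. The indicator name follows the fup_qtycoll_ol variable used in the -list- command elsewhere in the thread.

```stata
* Hypothetical stand-in for -olindicator-; the 1.5*IQR rule is an assumption.
capture program drop olindicator_sketch
program define olindicator_sketch
    syntax varname(numeric) [if] [in]
    * touse = 1 for observations selected by if/in with nonmissing varname
    marksample touse
    * create the indicator on first use; later calls only update their subset
    capture confirm variable `varlist'_ol
    if _rc generate byte `varlist'_ol = .
    quietly summarize `varlist' if `touse', detail
    if r(N) == 0 exit
    tempname lo hi
    scalar `lo' = r(p25) - 1.5 * (r(p75) - r(p25))
    scalar `hi' = r(p75) + 1.5 * (r(p75) - r(p25))
    quietly replace `varlist'_ol = (`varlist' < `lo' | `varlist' > `hi') if `touse'
end

* One pass over the data: no -preserve-, -keep- or reloading
use qtr_b_fup, clear
egen long grp = group(qtr fup_pdt fup_unit)
quietly summarize grp, meanonly
forvalues g = 1/`r(max)' {
    olindicator_sketch fup_qtycoll if grp == `g'
}
sort qtr fup_pdt fup_unit houscode
list houscode qtr fup_pdt fup_unit fup_qtycoll if fup_qtycoll_ol == 1, ///
    sepby(qtr fup_pdt fup_unit)
```

Because the program restricts itself to the observations marked by `touse', each call touches only one quarter/product/unit cell, and the listing still shows the actual values of fup_pdt and fup_unit.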
Ronnie Babigumira
Dear Eva and Nick
Thanks for the quick response. Sorry for the unclear email; I was trying
to keep the email short.
Since sending my first email, I have made some progress, but I would still
welcome help from you or anyone else.
By way of brief background, I am running this on a number of data sets
with the aim of listing the suspect cases in a way that allows us to
easily go back to the questionnaires and check that each listed value is
indeed a valid entry. I would therefore like to be able to see the actual
values of fup_pdt and fup_unit.
The full code I am working with is:
use qtr_b_fup, clear
levelsof qtr, local(qtrs)
foreach qtr of local qtrs {
    * reload the full data and keep one quarter at a time
    use qtr_b_fup, clear
    keep if qtr == `qtr'
    levelsof fup_pdt, local(fupdts)
    foreach i of local fupdts {
        * units recorded for this product
        levelsof fup_unit if fup_pdt == `i', local(fupunits)
        foreach j of local fupunits {
            preserve
            keep if fup_pdt == `i' & fup_unit == `j'
            * flag outliers within this product/unit cell
            olindicator fup_qtycoll
            di "Potential errors in the quantity collected for `i' & unit `j'"
            list houscode qtr fup_pdt fup_unit fup_qtycoll if fup_qtycoll_ol==1
            restore
        }
    }
}
where -olindicator- is a small program I have written to help me
identify the outliers. The output looks like this:
----------------------------------------------------------------------------------------
Potential errors in the quantity collected for 1 & unit 34
+---------------------------------------------------+
| houscode | qtr | fup_pdt | fup_unit | fup_qtycoll |
|----------+-----+---------+----------+-------------|
| 138 | 1 | 1 | 34 | 50 |
+---------------------------------------------------+
----------------------------------------------------------------------------------------
I think this can be improved and that I should not have to keep reloading
the data, so I welcome any help.
Hope this makes things clearer.
Ronnie
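
One way the reloading and the nested loops might be avoided altogether is to compute the outlier bounds within each quarter/product/unit cell using the by() option of -egen-. This is only a sketch: the variable names q1, q3 and suspect are hypothetical, and the 1.5 interquartile-range rule is an assumption, since the rule applied by -olindicator- is not shown.

```stata
use qtr_b_fup, clear

* quartiles of fup_qtycoll within each quarter/product/unit cell
egen double q1 = pctile(fup_qtycoll), p(25) by(qtr fup_pdt fup_unit)
egen double q3 = pctile(fup_qtycoll), p(75) by(qtr fup_pdt fup_unit)

* assumed rule: flag values more than 1.5 IQRs outside the quartiles
generate byte suspect = (fup_qtycoll < q1 - 1.5 * (q3 - q1) | ///
    fup_qtycoll > q3 + 1.5 * (q3 - q1)) if !missing(fup_qtycoll)

* one listing, grouped by cell, showing the actual product and unit values
sort qtr fup_pdt fup_unit houscode
list houscode qtr fup_pdt fup_unit fup_qtycoll if suspect == 1, ///
    sepby(qtr fup_pdt fup_unit)
```

The data are read once, and the suspect cases for every quarter appear in a single listing that can be checked against the questionnaires.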
Eva Poen wrote:
> Ronnie,
>
> this is difficult to answer without knowing what you mean by "run my
> checks". There are some tools out there to detect outliers; use
> -findit outliers- to see what's around.
>
> Whether or not you need nested loops will depend on whether or not
> your checks need to know the actual value of fup_pdt and fup_unit. The
> -preserve- and -keep- thing will slow things down, and you may be able
> to do without it (by just using if conditions in your code).
>
> But, really, we need some more information to be able to help.
>
> Eva
>
>
> 2009/4/13 Ronnie Babigumira <[email protected]>:
>> Dear list
>> I have quarterly data that looks like this
>>
>> qtr houscode fup_pdt fup_unit fup_qtycoll
>> 1 562 23 2 50
>> 1 570 628 2 2
>> 1 573 628 201 10
>> 1 573 628 2 2
>> 1 576 628 201 5
>> 1 576 628 201 20
>> 1 577 628 2 1
>> 1 578 628 2 1
>> 1 590 34 26 60
>> 1 595 34 26 200
>>
>> For each quarter, I would like to identify "strange" values (outliers) in
>> the variable fup_qtycoll (simply to rule out data entry error).
>>
>> This would be done for the different fup_pdt and fup_unit combinations.
>>
>> My initial idea is that I would have to do it in three -foreach- loops,
>> something along these lines (this does not work, since I need to
>> -preserve- before -keep-ing, which I would like to avoid, and it is not
>> by quarter yet):
>>
>> levelsof fup_pdt, local(fupdts)
>> foreach i of local fupdts {
>>     keep if fup_pdt == `i'
>>     levelsof fup_unit, local(fupunits)
>>     foreach j of local fupunits {
>>         keep if fup_unit == `j'
>>         *** run my checks and other stuff
>>     }
>> }
>>
>> I would appreciate some help on how I can do this efficiently
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/