Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: flagging significant values in a variable
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: flagging significant values in a variable
Date
Sat, 3 Mar 2012 17:39:26 +0000
On Sat, Mar 3, 2012 at 5:35 PM, Nick Cox <[email protected]> wrote:
> Partho:
>
> Threading is what your mailer does. It's not inherent in Statalist's
> operation. (Archiving is a separate matter.)
>
> But that's of no consequence: we all overlook previous emails from
> time to time.
>
> On the question of introducing a tolerance, I stand by my earlier
> comment.
>
> If I understand you correctly the code you posted earlier was really
> intended as a code sketch and readers were expected to be perceptive
> enough to realise that. That really wasn't clear to me. I would be
> surprised if it was clear to anybody else. But let's concentrate on the
> code and spell out what your approach implies, a loop over
> observations.
>
> forval i = 1/`=_N' {
>     if lci[`i'] - natlci[`i'] > `tol' {
>         replace tag = 1 in `i'
>     }
>     else if lci[`i'] - natlci[`i'] < -`tol' {
>         replace tag = 2 in `i'
>     }
>     else {
>         replace tag = 0 in `i'
>     }
> }
>
> However, this loop really isn't necessary as the whole thing can be
> done in one line.
>
> replace tag = cond(lci - natlci > `tol', 1, 2 * (lci - natlci < -`tol'))
>
> If that's over-compressed, there is a version in about three lines,
> similar in spirit to Graham's posting yesterday. Either will be much
> faster than the loop.
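
The version in about three lines is not spelled out in this message. A minimal sketch of what it might look like, assuming the data from the original question with LCI renamed lci, and a tolerance value that is purely illustrative (tag and tag2 are names chosen here for the comparison):

```stata
* Data copied from the original question; region 99 is the national row.
clear
input rate lci uci region
0.9727 0.9583 0.9849 1
0.9713 0.9523 0.9867 2
0.9835 0.9667 0.9971 3
0.9790 0.9741 0.9836 99
end

* Copy the national lower limit to every observation.
egen natlci = total(lci * (region == 99))

* Illustrative tolerance only, as in the thread.
local tol 0.0001

* Version with -if- qualifiers: each -replace- acts on all matching rows.
generate byte tag = 0
replace tag = 1 if lci - natlci >  `tol'
replace tag = 2 if lci - natlci < -`tol'

* One-line version with cond(), for comparison.
generate byte tag2 = cond(lci - natlci > `tol', 1, 2 * (lci - natlci < -`tol'))

list region lci natlci tag tag2, noobs
```

Both versions work on whole variables at once, so no explicit loop over observations is needed.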
>
> If you prefer a loop over observations here, that's your prerogative.
>
> (To concentrate on one specific code question, I have left in your tolerance.)
>
> Nick
>
> On Sat, Mar 3, 2012 at 3:20 PM, Partho Sarkar <[email protected]> wrote:
>> True, I had overlooked the earlier solutions, because the question
>> appears on two separate threads. I answered the one without any
>> answers at the time, without having seen the other thread. (By the
>> way, this raises some issues about duplicate threads, possibly
>> unintended, which often confuse!)
>>
>> The code was just a sketch of an idea. I assumed (mistakenly perhaps)
>> that the user would realize the need to qualify the if loops in
>> practice (start off the loops with a foreach statement to loop through
>> all the observations). The tolerance given is also only an example!
>> All this based on what I think is a legitimate interpretation of the
>> original question!
>>
>> Partho
>>
>> On Sat, Mar 3, 2012 at 8:22 PM, Nick Cox <[email protected]> wrote:
>>> This post overlooks earlier solutions posted yesterday. I see no need
>>> to complicate anything by introduction of a tolerance, which seems
>>> based on an idea that the rates are exact decimals to 4 d.p.
>>>
>>> Also, the code won't work as intended because it confuses the -if-
>>> command and the -if- qualifier.
>>>
>>> FAQ . . . . . . . . . . . . . . . . . . . . . if command vs. if qualifier
>>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J. Wernow
>>> 6/00 I have an if command in my program that only seems
>>> to evaluate the first observation, what's going on?
>>> http://www.stata.com/support/faqs/lang/ifqualifier.html
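
The distinction the FAQ draws can be seen on a toy dataset. This is a small sketch with made-up values; x, cmd_flag and qual_flag are hypothetical names used only for illustration:

```stata
clear
input x
1
5
9
end

* -if- command: the expression is evaluated once, and x is read as x[1].
* Here x[1] is 1, so the condition is false and nothing is replaced.
generate byte cmd_flag = 0
if x > 3 {
    replace cmd_flag = 1
}

* -if- qualifier: the expression is evaluated for each observation.
generate byte qual_flag = 0
replace qual_flag = 1 if x > 3

list, noobs
```

The -if- command leaves cmd_flag at 0 throughout, while the qualifier sets qual_flag to 1 in the second and third observations.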
>>>
>>> Nick
>>>
>>> On Sat, Mar 3, 2012 at 2:10 PM, Partho Sarkar <[email protected]> wrote:
>>>> Tim,
>>>>
>>>> I am afraid you haven't spelt it out very clearly! Based on one
>>>> possible interpretation, this would be one way to do it (shown only
>>>> for the LCI (renamed lci) variable):
>>>>
>>>> ---------------------------START CODE-------------------------------------------
>>>>
>>>> egen natlci=total(lci*(region==99)) // generates a value for each obs., equal to national value
>>>> local tol .0001 // define tolerance for "significantly lower or higher"
>>>> gen byte tag= .
>>>> if lci-natlci>`tol' {
>>>> replace tag=1
>>>> }
>>>> else if lci-natlci< -`tol' {
>>>> replace tag= 2
>>>> }
>>>> else {
>>>> replace tag = 0
>>>> }
>>>>
>>>> ---------------------------END CODE-------------------------------------------
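
The egen line in the code above works because (region==99) evaluates to 1 or 0, so the total picks out the national row and writes it to every observation. A short sketch of that step alone, assuming exactly one observation has region 99 and that its lci is not missing:

```stata
clear
input lci region
0.9583 1
0.9523 2
0.9667 3
0.9741 99
end

* (region == 99) is 1 on the national row and 0 elsewhere, so the total
* equals the national lci, and -egen- stores it in every observation.
egen natlci = total(lci * (region == 99))

* Guard for the assumption of exactly one national row.
count if region == 99
assert r(N) == 1

list, noobs
```

With more than one national row the total would add their values together, which is why the count check is worth keeping.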
>>>>
>>>> Hope this helps
>>>>
>>>> Partho
>>>>
>>>> From Tim Evans <[email protected]>
>>>> To "'[email protected]'" <[email protected]>
>>>> Subject st: flagging significant values in a variable
>>>> Date Fri, 2 Mar 2012 09:24:46 +0000
>>>>
>>>> Hi,
>>>>
>>>> I have a dataset that has variables of rates, LCI and UCI for a
>>>> number of regions in addition to a national average (rate, LCI, UCI)
>>>> so that it looks like this:
>>>>
>>>> rate     LCI      UCI      region
>>>> 0.9727   0.9583   0.9849   1
>>>> 0.9713   0.9523   0.9867   2
>>>> 0.9835   0.9667   0.9971   3
>>>> 0.9790   0.9741   0.9836   99
>>>>
>>>> What I would like to do is generate a flag beside each row that
>>>> will flag up entries where they are significantly higher (1) or lower
>>>> (2) or not significantly different (0) to region 99 - I'm unsure as to
>>>> the code here and would appreciate any advice. I'm using Stata 11.2.
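
One possible reading of the question, offered only as an assumption and not as the solution posted elsewhere in the thread, is to flag a region when its interval does not overlap the national interval. A minimal sketch on the data above, with the variables renamed to lower case and natlci, natuci and flag as hypothetical names:

```stata
clear
input rate lci uci region
0.9727 0.9583 0.9849 1
0.9713 0.9523 0.9867 2
0.9835 0.9667 0.9971 3
0.9790 0.9741 0.9836 99
end

* Copy the national limits (region 99) to every observation.
egen natlci = total(lci * (region == 99))
egen natuci = total(uci * (region == 99))

* Assumed rule: 1 if the whole interval lies above the national one,
* 2 if it lies wholly below, 0 if the intervals overlap.
generate byte flag = 0
replace flag = 1 if lci > natuci
replace flag = 2 if uci < natlci

list region lci uci natlci natuci flag, noobs
```

Under this rule every interval in the example overlaps the national one, so all rows are flagged 0. Non-overlap of intervals is a conservative criterion, so whether it matches the intended meaning of "significantly" is for the original poster to decide.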
>>>