Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: flagging significant values in a variable
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: flagging significant values in a variable
Date
Sun, 4 Mar 2012 08:33:17 +0000
I read Tim's variables UCI and LCI as giving upper and lower
confidence limits. It follows that significance or the lack of it
is to be determined by where values lie relative to those variables.
This is naturally a convention. The approximations concerned are
likely to be much grosser than can be expressed by a tolerance of 1e-4
for at least the following reasons: (1) a national interval, determined
somehow, is being compared with area intervals; (2) unless Tim tells
us otherwise, there is no adjustment for any spatial dependence.
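A minimal sketch of one such convention, offered as an illustration and not as Tim's actual setup: an area counts as higher if its whole interval lies above the national interval, and as lower if it lies wholly below. The variable names uci and natuci and the data values are hypothetical; only lci and natlci appear in the code quoted below.

```stata
* Hypothetical data: lci/uci are area limits, natlci/natuci national limits
clear
input lci uci natlci natuci
1.2 1.8 0.9 1.1
0.4 0.7 0.9 1.1
0.8 1.3 0.9 1.1
end

* 1 = area interval wholly above the national interval
* 2 = area interval wholly below it
* 0 = intervals overlap
* The if qualifier leaves tag missing when any limit is missing,
* because Stata treats missing as larger than any number.
generate byte tag = cond(lci > natuci, 1, cond(uci < natlci, 2, 0)) ///
    if !missing(lci, uci, natlci, natuci)

list
```

No tolerance is needed under this reading: the comparison is between limits, not between differences and a small constant.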
On threads: The Statalist archive maintained by StataCorp shows Graeme
MacLennan's reply to Tim Evans and mine threaded together (Graeme, not
Graham; another name I got wrong yesterday). See
http://www.stata.com/statalist/archive/2012-03/
The Harvard archive separates these threads. I don't know why. See
<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1203/subject/>
I wouldn't jump to the conclusion that the software used by Harvard is
worse, as on occasion I've seen the StataCorp archive make the wrong
decision and Harvard make the right one.
Archive threading is often messed up because people send replies that
aren't replies as far as their mailer is concerned, or because people
start new threads with replies to other messages, but there is no
reason to suppose that anyone departed from standard protocol in this
case.
So, it comes down to the archiving software, which just makes the
wrong decision in a small fraction of cases.
As you are a member of Statalist, all Statalist posts are sent to you,
either individually or in daily digest form, depending on how you
subscribe.
Nick
On Sun, Mar 4, 2012 at 4:30 AM, Partho Sarkar <[email protected]> wrote:
> Fair enough, Nick. Especially appreciate your elegant one line cond()
> code- I have struggled with this useful but arcane function in the
> past! I still think, though, that the "tolerance" threshold is a
> natural interpretation of the phrase " *significantly* higher/lower"
> -it certainly came to my mind quite naturally!
>
> By the way, if I may harp on the side issue of duplicate threads once
> more, I (like many others, perhaps) don't get all Statalist posts
> mailed to me (unless they are in response to one of my own posts) but
> view the archive occasionally, and respond if I see something that I
> want to respond to. So my mailer is not really at fault here! This
> makes such "instances" of oversight of earlier replies much more
> likely if a question is posted twice on separate threads (that happens
> fairly often). I wonder if it would be feasible/cost-effective to
> put in place a mechanism to prevent this (beyond asking listers to
> observe care in posting).
>
> Regards
> Partho
>
> On Sat, Mar 3, 2012 at 11:05 PM, Nick Cox <[email protected]> wrote:
>> Threading is what your mailer does. It's not inherent in Statalist's
>> operation. (Archiving is a separate matter.)
>>
>> But that's of no consequence: we all overlook previous emails from
>> time to time.
>>
>> On the question of introducing a tolerance to the question, I stand by
>> my earlier comment.
>>
>> If I understand you correctly the code you posted earlier was really
>> intended as a code sketch and readers were expected to be perceptive
>> enough to realise that. That really wasn't clear to me. I would be
>> surprised if it was clear to anybody else. But let's concentrate on the
>> code and spell out what your approach implies, a loop over
>> observations.
>>
>> forval i = 1/`=_N' {
>>     if lci[`i'] - natlci[`i'] > `tol' {
>>         replace tag = 1 in `i'
>>     }
>>     else if lci[`i'] - natlci[`i'] < -`tol' {
>>         replace tag = 2 in `i'
>>     }
>>     else {
>>         replace tag = 0 in `i'
>>     }
>> }
>>
>> However, this loop really isn't necessary as the whole thing can be
>> done in one line.
>>
>> replace tag = cond(lci - natlci > `tol', 1, 2 * (lci - natlci < -`tol'))
>>
>> If that's over-compressed, there is a shorter version in about three
>> lines, similar in spirit to Graham's posting yesterday. That will be
>> much faster.
>>
>> If you want to prefer a loop over observations here, that's your prerogative.
>>
>> (To concentrate on one specific code question, I have left in your tolerance.)
>>
>> Nick
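The version "in about three lines" mentioned above is not shown in the thread. A sketch of what it might look like, keeping the tolerance and the variable names lci and natlci from the quoted code; the data values are hypothetical, and the value of the tolerance is taken from the 1e-4 mentioned earlier:

```stata
* Hypothetical data for illustration only
clear
input lci natlci
1.2 1.0
0.8 1.0
1.0 1.0
end

local tol = 1e-4

* Start at 0 wherever both limits are observed, then overwrite.
generate byte tag = 0 if !missing(lci, natlci)
* Guard against missings: a missing difference counts as > `tol' in Stata.
replace tag = 1 if lci - natlci > `tol' & !missing(lci, natlci)
replace tag = 2 if lci - natlci < -`tol'

list
```

Each statement works on all observations at once, so no loop over observations is needed.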
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/