Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: flagging significant values in a variable
Date: Sat, 3 Mar 2012 17:39:26 +0000

On Sat, Mar 3, 2012 at 5:35 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> Partho:
>
> Threading is what your mailer does. It's not inherent in Statalist's
> operation. (Archiving is a separate matter.)
>
> But that's of no consequence: we all overlook previous emails from
> time to time.
>
> On the question of introducing a tolerance to the question, I stand by
> my earlier comment.
>
> If I understand you correctly the code you posted earlier was really
> intended as a code sketch and readers were expected to be perceptive
> enough to realise that. That really wasn't clear to me. I would be
> surprised if it was clear to anybody else. But let's concentrate on the
> code and spell out what your approach implies, a loop over
> observations.
>
> forval i = 1/`=_N' {
>     if lci[`i'] - natlci[`i'] > `tol' {
>         replace tag = 1 in `i'
>     }
>     else if lci[`i'] - natlci[`i'] < -`tol' {
>         replace tag = 2 in `i'
>     }
>     else {
>         replace tag = 0 in `i'
>     }
> }
>
> However, this loop really isn't necessary as the whole thing can be
> done in one line.
>
> replace tag = cond(lci - natlci > `tol', 1, 2 * (lci - natlci < -`tol'))
>
> If that's over-compressed, there is a shorter version in about three
> lines, similar in spirit to Graham's posting yesterday. That will be
> much faster.
>
> If you want to prefer a loop over observations here, that's your prerogative.
>
> (To concentrate on one specific code question, I have left in your tolerance.)
>
> Nick
>
> On Sat, Mar 3, 2012 at 3:20 PM, Partho Sarkar <partho.ss+lists@gmail.com> wrote:
>> True, I had overlooked the earlier solutions, because the question
>> appears on 2 separate threads. I answered the one w/o any answers at
>> the time, w/o having seen the other thread. (Btw this raises some
>> issues about duplicate threads, possibly unintentionally so, which often
>> confuse!)
>>
>> The code was just a sketch of an idea. I assumed (mistakenly perhaps)
>> that the user would realize the need to qualify the if loops in
>> practice (start off the loops with a foreach statement to loop through
>> all the observations). The tolerance given is also only an example!
>> All this based on what I think is a legitimate interpretation of the
>> original question!
>>
>> Partho
>>
>> On Sat, Mar 3, 2012 at 8:22 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> This post overlooks earlier solutions posted yesterday. I see no need
>>> to complicate anything by introduction of a tolerance, which seems
>>> based on an idea that the rates are exact decimals to 4 d.p.
>>>
>>> Also, the code won't work as intended because it confuses the -if-
>>> command and the -if- qualifier.
>>>
>>> FAQ     . . . . . . . . . . . . . . . . . . . . . if command vs. if qualifier
>>>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J. Wernow
>>>         6/00    I have an if command in my program that only seems
>>>                 to evaluate the first observation, what's going on?
>>>                 http://www.stata.com/support/faqs/lang/ifqualifier.html
>>>
>>> Nick
>>>
>>> On Sat, Mar 3, 2012 at 2:10 PM, Partho Sarkar <partho.ss+lists@gmail.com> wrote:
>>>> Tim,
>>>>
>>>> I am afraid you haven't spelt it out very clearly! Based on one
>>>> possible interpretation, this would be one way to do it (shown only
>>>> for the LCI (renamed lci) variable):
>>>>
>>>> ---------------------------START CODE-------------------------------------------
>>>>
>>>> egen natlci=total(lci*(region==99)) // generates a value for each obs., equal to the national value
>>>> local tol .0001 // define tolerance for "significantly lower or higher"
>>>> gen byte tag= .
>>>> if lci-natlci>`tol' {
>>>>     replace tag=1
>>>> }
>>>> else if lci-natlci< -`tol' {
>>>>     replace tag= 2
>>>> }
>>>> else {
>>>>     replace tag = 0
>>>> }
>>>>
>>>> ---------------------------END CODE-------------------------------------------
>>>>
>>>> Hope this helps
>>>>
>>>> Partho
>>>>
>>>> From     Tim Evans <Tim.Evans@wmciu.nhs.uk>
>>>> To       "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
>>>> Subject  st: flagging significant values in a variable
>>>> Date     Fri, 2 Mar 2012 09:24:46 +0000
>>>>
>>>> Hi,
>>>>
>>>> I have a dataset that has variables of rates, LCI and UCI for a
>>>> number of regions in addition to a national average (rate, LCI, UCI)
>>>> so that it looks like this:
>>>>
>>>> rate     LCI      UCI      region
>>>> 0.9727   0.9583   0.9849   1
>>>> 0.9713   0.9523   0.9867   2
>>>> 0.9835   0.9667   0.9971   3
>>>> 0.9790   0.9741   0.9836   99
>>>>
>>>> What I would like to do is generate a flag beside each row that
>>>> will flag up entries where they are significantly higher (1) or lower
>>>> (2) or not significantly different (0) to region 99 - I'm unsure as to
>>>> the code here and would appreciate any advice. I'm using Stata 11.2.
>>>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
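
The thread contrasts a loop over observations with a vectorised statement. The following is a minimal sketch of the vectorised route, not code posted by anyone in the thread. It assumes the four rows from Tim's example typed in with lowercase variable names, builds natlci the way Partho's post does, and keeps the 0.0001 tolerance purely as the illustrative value used in the discussion. The names tag and tag2 are hypothetical.

```stata
* Sketch only: toy data copied from Tim's example (variable names lowercased).
clear
input rate lci uci region
0.9727 0.9583 0.9849 1
0.9713 0.9523 0.9867 2
0.9835 0.9667 0.9971 3
0.9790 0.9741 0.9836 99
end

* National lower limit repeated on every row
* (assumes region 99 holds the national figures, as in the thread).
egen natlci = total(lci * (region == 99))

* Illustrative tolerance only, as in the thread.
local tol 0.0001

* The -if- qualifier is evaluated for every observation,
* so no explicit loop over observations is needed.
gen byte tag = 0
replace tag = 1 if lci - natlci >  `tol'
replace tag = 2 if lci - natlci < -`tol'

* The same logic compressed into one statement with cond().
gen byte tag2 = cond(lci - natlci > `tol', 1, 2 * (lci - natlci < -`tol'))

* Both versions should agree on every row.
assert tag == tag2
list region lci natlci tag tag2
```

The three-statement form trades brevity for readability; the cond() form is the one-line equivalent. Neither addresses whether a tolerance is the right criterion for "significantly different", which the thread leaves as a point of disagreement.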