Thanks to Chris for very full feedback.
The problem was
Data: hourly barometric pressure data for many days
Variables: year month day hour pressure
So during any day the pressure rises and falls.
Aim: generate a new variable containing the maximum fall
within a day, from a peak to a trough. This is not the
same as the daily range.
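To see how the maximum fall differs from the daily range, here is a
minimal sketch on made-up data for one day. It reads "fall" as the drop
from the start of an unbroken decline to the current hour, and it uses
plain -by:- subscripting rather than -tsspell-, so it is an illustration
of the aim and not the code discussed in this post. The variable names
start, fall, maxfall, hi and lo are invented for the example.

```stata
clear
input day hour pressure
1 1 1000
1 2 1005
1 3 1002
1 4 1008
1 5 1007
end

* mark the pressure at the start of each run: the first hour of the day,
* or any hour where pressure did not fall relative to the previous hour
bysort day (hour): gen start = pressure if _n == 1 | pressure >= pressure[_n-1]

* carry that starting pressure forward through each unbroken decline
bysort day (hour): replace start = start[_n-1] if missing(start)

* fall so far within the current decline, and the largest such fall per day
gen fall = start - pressure
egen maxfall = max(fall), by(day)

* daily range, for comparison
egen hi = max(pressure), by(day)
egen lo = min(pressure), by(day)
gen range = hi - lo

list hour pressure fall maxfall range
```

Here the largest fall is 3 (1005 to 1002), while the daily range is 8
(1008 - 1000).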
This was my original code, which assumes that -tsspell- has been
installed from SSC:
egen panel = group(year month day)
tsset panel hour
tsspell , cond(F.press < press)
replace _spell = L._spell if L._end
egen max = max(pressure) if _spell, by(panel _spell)
egen min = min(pressure) if _spell, by(panel _spell)
egen range = max(max - min), by(panel)
The corrected code, which differs only in the -replace- line, is
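* one panel per calendar day
egen panel = group(year month day)
tsset panel hour
* spells of falling pressure: the next hour is lower than this hour
tsspell , cond(F.pressure < pressure)
* attach the trough (the hour just after a spell ends) to that spell
replace _spell = L._spell if L._end == 1
egen max = max(pressure) if _spell, by(panel _spell)
egen min = min(pressure) if _spell, by(panel _spell)
* largest peak-to-trough fall within each day
egen range = max(max - min), by(panel)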
The bug was in assuming that because _end is generated
with values 0 and 1, you can use the short-cut
if L._end
for
if L._end == 1
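But the short-cut doesn't work: L._end is missing for the first
observation of each panel, and Stata treats missing as non-zero,
and so as true. The first observation of each day therefore had
its _spell replaced by L._spell, which is also missing there.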
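A minimal sketch of the pitfall, on made-up data (the variable flag
stands in for _end and is invented for the example):

```stata
clear
input panel hour flag
1 1 0
1 2 1
1 3 0
2 1 0
2 2 0
2 3 1
end
tsset panel hour

* L.flag is missing for the first hour of each panel
list panel hour flag L.flag, sepby(panel)

* short-cut: counts the two missing lags as well as the one lag equal to 1
count if L.flag

* explicit test: counts only the lag that is equal to 1
count if L.flag == 1
```

The first count is 3 and the second is 1; the difference is the two
panel starts, where the lag is missing.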
Nick
Chris Wallace wrote:
>
> Many thanks to David, Nikos, Scott and Nick who all responded to my post
> about a colleague's request to find the maximum drop in a variable
> during a given time period.
>
> Reading the answers was very instructive for me, especially as I tried
> and failed to solve the query before I posted.
>
> Unfortunately, none worked "out of the box", but they were close
> enough that my colleague has now been able to work out her own solution.
>
> In summary:
>
> - Nikos's appeared to work fine, but failed when there were two
> consecutive values that were equal. This we fixed by replacing
> by day:gen tri=-1*(change<0)+(change>0)
> with
> by day:gen tri=-1*(change<=0)+(change>0)
>
> - Nick's was good on brevity, and neat enough not to take much figuring
> out. But it failed when the drop began with the first observation in a
> day. This my colleague says can be fixed by replacing
>
> egen max = max(pressure) if _spell, by(panel _spell)
>
> with
>
> gen max = pressure if _seq ==1
>
> but I can't see why it wouldn't fail now if the maximum pressure wasn't
> the first observation of the day...
>
> - David's answer we liked lots, particularly for all the helpful
> comments! It appears to produce the right answer every time, but
> rounded to a whole number (although in my sample data in the email,
> pressure was all integer, in the real data it is float). This we fixed
> by dropping the "int" from
>
> by year month day runno: gen int extreme = pressure[_N]
>
> - Scott's I'm afraid we got a bit lost on. It doesn't always work right,
> but we can't figure out why it fails. (Sorry).
>
> Thanks again to all of you for taking the time to help on this. And
> apologies for listing the failures above like they were problems in your
> code. In fact, all the code worked fine for the simplified data I sent
> in my original email, and the task I set was a little unfair - expecting
> you to write code in anticipation of idiosyncrasies in a real dataset I
> didn't show you!