Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Extract year of maximum production
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: Extract year of maximum production
Date
Sat, 20 Oct 2012 09:18:35 +0100
There is a review of working row-wise in

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
        Q1/09   SJ 9(1):137--157
        shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced

which is accessible to all under the SJ's 3-year rule. -search
rowwise- to get a clickable link.
You have already alluded to two key principles in this territory:
1. A -reshape long- can make a row-wise problem easier.
2. There are -egen- functions for row problems.
Even more important than #2 is
3. You can solve many problems with a loop over variables.
as #2 is just an application of #3.
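As a sketch of principle #1, assuming a numeric identifier named county: the toy values below are invented for illustration, and nties is a made-up name. The long-format route might look like this, taking the earliest year when there are ties.

```stata
* Toy data (hypothetical values): three counties, three years
clear
input county timber1910 timber1911 timber1912
1 5 9 2
2 7 7 3
3 . 4 1
end

* Principle 1: go long, then work within county
reshape long timber, i(county) j(year)

* Largest value per county; egen's max() ignores missing values
egen maxprod = max(timber), by(county)

* Earliest year in which that maximum occurs
egen peakyr = min(cond(timber == maxprod & !missing(timber), year, .)), by(county)

* Number of years sharing the maximum, to flag ties
egen nties = total(timber == maxprod & !missing(timber)), by(county)

list county year timber maxprod peakyr nties, sepby(county)
```

Here county 2 has the same maximum in 1910 and 1911, so nties is 2 and peakyr is 1910. Whether the earliest year is the right choice for ties is a decision for you, not something the data settle.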
With 3000-odd counties there may well be ties, i.e. counties for which
two or more years show the same maximum.
Let's be alert to that and use a slightly odd technique. (Others will
be suggested by my paper referred to above.) Starting with your idea
egen maxprod = rowmax(timber????)
let's initialise a string variable
gen whenmax = ""
Now whenever we find a year when the maximum occurred we add that year
to our string
* isn't that 101 years? no matter
qui forval y = 1910/2010 {
    replace whenmax = whenmax + "`y' " if timber`y' == maxprod
}
Note the extra space in the " " above.
In most counties there will be a single year in which the maximum occurred:
gen when = real(whenmax) if wordcount(whenmax) == 1
And you can look at the cases with ties to think what you want to do with them:
l county whenmax if wordcount(whenmax) > 1
Notes.
To extract words from a string, use -word()-.
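For instance, a minimal sketch on made-up strings like those the loop above builds (nmax, firstmax and lastmax are illustrative names only):

```stata
* Toy data (hypothetical): space-separated years of the maximum
clear
input str20 whenmax
"1911 "
"1910 1911 "
"1950 1975 2001 "
end

* How many years share the maximum
gen nmax = wordcount(whenmax)

* First and last tied years; a negative index counts from the end
gen firstmax = real(word(whenmax, 1))
gen lastmax = real(word(whenmax, -1))

list
```

That gives you the earliest and latest tied years as numeric variables, which may be all you need to decide what to do with ties.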
If you have counties with 0 values, they might as well be -mvdecode-d
to missing before you calculate your maximum. Extreme case: a county
with 0s in every year; it is absurd to say that the maximum
production occurred in every year. (Or you might just -drop- such
counties.)
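A sketch of that step on invented toy data, with three years standing in for 1910/2010. One wrinkle: a county that is missing in every year gets a missing maximum, and . == . is true in Stata, so the loop here carries an extra !missing(maxprod) condition. That guard is an addition for this sketch and is not in the loop as written above.

```stata
* Toy data (hypothetical values): county 2 has 0s in every year
clear
input county timber1910 timber1911 timber1912
1 0 9 2
2 0 0 0
end

* Recode 0 to system missing in every timber variable
mvdecode timber????, mv(0)

* rowmax() ignores missings; it returns missing if all values are missing
egen maxprod = rowmax(timber????)

* Same string technique, but skip counties whose maximum is missing
gen whenmax = ""
quietly forvalues y = 1910/1912 {
    replace whenmax = whenmax + "`y' " if timber`y' == maxprod & !missing(maxprod)
}

list county maxprod whenmax
```

County 2 ends up with an empty whenmax rather than a list of every year.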
Similarly, this technique assumes that there aren't so many ties that
the years won't fit in a -str244- variable (each year takes 5
characters, so at most 48 tied years fit). If that's wrong, you
may need a different technique.
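One such technique, sketched on invented toy data, keeps everything numeric: the first and last years of the maximum plus a count of how many years share it. The names firstmax, lastmax and nmax are made up for illustration. Declaring maxprod as double is a precaution of this sketch, on the assumption that the timber variables might hold values a float cannot represent exactly, in which case the == comparison could fail.

```stata
* Toy data (hypothetical values): county 2 has a tie
clear
input county timber1910 timber1911 timber1912
1 5 9 2
2 7 7 3
end

* double guards the equality test below against float rounding
egen double maxprod = rowmax(timber????)

gen firstmax = .
gen lastmax = .
gen nmax = 0

* Loop over years (1910/2010 in the real data)
quietly forvalues y = 1910/1912 {
    replace firstmax = `y' if timber`y' == maxprod & !missing(maxprod) & missing(firstmax)
    replace lastmax = `y' if timber`y' == maxprod & !missing(maxprod)
    replace nmax = nmax + 1 if timber`y' == maxprod & !missing(maxprod)
}

list county maxprod firstmax lastmax nmax
```

Counties with nmax == 1 have firstmax equal to lastmax, and that is the single peak year; the others are the ties to inspect.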
Nick
On Sat, Oct 20, 2012 at 4:10 AM, Daniel Escher <[email protected]> wrote:
> I have 100 years of timber production data for all counties in the US
> (~3,100). The data are currently in wide format - i.e., timber1910,
> timber1911 ... timber2010 (but I can switch them to long if needed).
>
> I would like to extract the year of maximum timber production for each
> county and put that year in a variable called "peakyr." Two things I
> thought might be helpful: 1) I can use -egen newvar = rowmax...- to get the
> maximum value of a row. 2) I can separate the stub and the year in the
> variable name using the -substr- function. Unfortunately, I don't know how
> to make those two processes "talk" to each other - if they are even the
> right ones to use.
>
> Stata/IC 12.1 for Windows (32-bit)
> Revision 01 Oct 2012