There are two issues here: what to calculate and
how to do it. Eric's example presumes two
estimates for each combination of state, county and year,
and a wish to find the difference between them.
That could arise, but on the face of it
I would guess that what is wanted is rather
bysort state county (year) : gen diff = emp - emp[_n-1]
i.e. the difference between each year and the previous.
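
As a quick check, here is a minimal sketch that types in the four observations from the original question and applies that command. The variable name emp is an assumption carried over from the line above; the original poster's variable is called employment.

```stata
* Toy data: the four observations shown in the original question
clear
input year state county emp
1 1 1 10
2 1 1 20
1 2 1 15
2 2 1 30
end

* Sort by state and county, then by year within each county,
* and difference each observation from the previous one
bysort state county (year) : gen diff = emp - emp[_n-1]

list, sepby(state county) noobs
```

The first year in each county has no predecessor, so diff is missing there; the second year gets 10 for state 1 and 15 for state 2.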
A more robust approach would be to -tsset- the data
and use the difference operator:
egen countyid = group(state county), label
tsset countyid year
gen diff = D.emp
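
One way to see the difference between the two approaches is a county with a gap in its years. The sketch below uses made-up data (a single county with year 3 missing, not taken from the original question) and computes both versions side by side; the names diff1 and diff2 are just for illustration.

```stata
* Made-up data: one county observed in years 1, 2 and 4 (year 3 absent)
clear
input year state county emp
1 1 1 10
2 1 1 20
4 1 1 50
end

* Subscript version: differences from the previous observation,
* whatever its year happens to be
bysort state county (year) : gen diff1 = emp - emp[_n-1]

* tsset version: D. differences from the previous year,
* and is missing if that year is not in the data
egen countyid = group(state county), label
tsset countyid year
gen diff2 = D.emp

list year emp diff1 diff2, noobs
```

For year 4, diff1 is 30 (50 minus the year 2 value), while diff2 is missing because there is no year 3 to difference from. In addition, -tsset- refuses data with repeated time values within a panel, so duplicate state-county-year combinations like those in Eric's example would be flagged rather than silently differenced.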
Nick
Eric G. Wruck
> You were close but your generate (gen) statement wasn't quite right.
>
>
> . bysort year state county: gen employdiff = employment - employment[_n - 1]
> (2 missing values generated)
>
> . l, noobs
>
> +---------------------------------------------+
> | year   state   county   employ~    employ~f |
> |---------------------------------------------|
> |    1       1        1        10           . |
> |    1       1        1        15           5 |
> |    2       2        1        20           . |
> |    2       2        1        30          10 |
> +---------------------------------------------+
> >My data is structured as follows
> >
> >year state county employment
> >1 1 1 10
> >2 1 1 20
> >1 2 1 15
> >2 2 1 30
> >...
> >for 6 years, 50 states, and some counties in each state. I
> >have 1.5 million observations.
> >
> >I want to construct a variable that is the difference in
> >employment by year in each state and county.
> >
> >I tried
> >
> >by year state county, sort: gen newvar = employment-employment[_n-1]
> >
> >but that didn't work.