I have a panel dataset that looks something like this:
year  gdpReal  gdpNom  gdpdefl  PPI  dPPI
1900      126       .        .   25     .
1901      132       .        .   27   .08
1902      138       .        .   29  .074
1903      142       .        .   31  .069
1904      147   41.16       28   32  .032
1905      150      48       32   34  .063
1906      151   49.83       33   35  .029
The variable I need is gdpNom. Normally, one would obtain this as:
gdpNom = gdpReal * (gdpdefl/100)
However, this is not possible here, because gdpdefl is missing
wherever gdpNom is missing.
So, I want to estimate gdpdefl using dPPI, something like this:
gen gdpdeflFILL = F.gdpdefl/(1+F.dPPI) if gdpdefl==.
gen gdpdeflEST = gdpdefl
replace gdpdeflEST = gdpdeflFILL if gdpdeflEST==.
This would allow me to estimate gdpNom:
gen gdpNomEST = gdpNom
replace gdpNomEST = gdpReal * (gdpdeflEST/100) if gdpNomEST==.
BUT, my brilliant plan falls apart because the lead operator (F.)
does not cascade the way the lag operator (L.) does. F.gdpdefl is
non-missing only in the year just before the first observed deflator,
so in the example above I only get an estimate for 1903.
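To make the intended recursion concrete, here is what the cascade would need to produce for the sample rows above, worked by hand with rounded values. Here d_t is just my shorthand for gdpdefl in year t; it is not a variable in the data.

```latex
\begin{align*}
d_{1903} &= \frac{d_{1904}}{1 + \mathrm{dPPI}_{1904}} = \frac{28}{1.032} \approx 27.13 \\
d_{1902} &= \frac{d_{1903}}{1 + \mathrm{dPPI}_{1903}} \approx \frac{27.13}{1.069} \approx 25.38 \\
d_{1901} &= \frac{d_{1902}}{1 + \mathrm{dPPI}_{1902}} \approx \frac{25.38}{1.074} \approx 23.63 \\
d_{1900} &= \frac{d_{1901}}{1 + \mathrm{dPPI}_{1901}} \approx \frac{23.63}{1.08} \approx 21.88
\end{align*}
```

Only the first of these (1903) comes out of my code; the earlier years stay missing.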
I feel like there must be an easy solution to this, but I am stuck.
My panel has 57 groups and 150 time periods, so I'd much rather not
do this by hand.
Any help would be much appreciated.
Thanks,
Paul Rivera