Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Average of last two previous observations conditional on another variable value
From
Nick Cox <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: Average of last two previous observations conditional on another variable value
Date
Mon, 24 Mar 2014 14:06:37 +0000
I struggled with this mightily until I got a better way to think about it.
What's clear is that a loop over observations is the solution of last
resort, although long-term readers of this list should be able to
recall that that doesn't rule out such a solution being offered....
The trick is that the two previous values you want are easy to
identify *once you have a dataset that contains only positive values
for -dummy-*.
In your example -id- is redundant given -year- and is manifestly not a
panel identifier. So, I faked a panel identifier for a single panel.
The code should work with an equivalent real panel identifier.
. clear
. input id year dummy y average
id year dummy y average
1. 1 1990 0 10 .
2. 2 1991 0 11 .
3. 3 1992 1 12 .
4. 4 1993 1 13 .
5. 5 1994 0 14 12.5
6. 6 1995 0 15 12.5
7. 7 1996 1 16 12.5
8. 8 1997 1 17 14.5
9. 9 1998 1 18 16.5
10. 10 1999 0 19 17.5
11. end
. gen panelid = 42
. save mymaster
file mymaster.dta saved
. keep if dummy
(5 observations deleted)
. bysort panelid (year) : gen time = _n
. tsset panelid time
panel variable: panelid (strongly balanced)
time variable: time, 1 to 5
delta: 1 unit
. gen work = (y + L.y)/2
(1 missing value generated)
. keep panelid year work
. merge 1:1 panelid year using mymaster
Result # of obs.
-----------------------------------------
not matched 5
from master 0 (_merge==1)
from using 5 (_merge==2)
matched 5 (_merge==3)
-----------------------------------------
. tsset panelid year
panel variable: panelid (strongly balanced)
time variable: year, 1990 to 1999
delta: 1 unit
. replace work = L.work if missing(work)
(3 real changes made)
. gen newaverage = L.work
(4 missing values generated)
. drop _merge work
. list
+-------------------------------------------------------+
| year panelid id dummy y average newave~e |
|-------------------------------------------------------|
1. | 1990 42 1 0 10 . . |
2. | 1991 42 2 0 11 . . |
3. | 1992 42 3 1 12 . . |
4. | 1993 42 4 1 13 . . |
5. | 1994 42 5 0 14 12.5 12.5 |
|-------------------------------------------------------|
6. | 1995 42 6 0 15 12.5 12.5 |
7. | 1996 42 7 1 16 12.5 12.5 |
8. | 1997 42 8 1 17 14.5 14.5 |
9. | 1998 42 9 1 18 16.5 16.5 |
10. | 1999 42 10 0 19 17.5 17.5 |
+-------------------------------------------------------+
For convenience here is the code as a block, with commentary.
* set up example
clear
input id year dummy y average
1 1990 0 10 .
2 1991 0 11 .
3 1992 1 12 .
4 1993 1 13 .
5 1994 0 14 12.5
6 1995 0 15 12.5
7 1996 1 16 12.5
8 1997 1 17 14.5
9 1998 1 18 16.5
10 1999 0 19 17.5
end
* we need a panel identifier, here just for a single panel
gen panelid = 42
save mymaster
* reduce the dataset to just -dummy==1- and -tsset- with pseudotime
keep if dummy
bysort panelid (year) : gen time = _n
tsset panelid time
* we want the average of this y and the previous y: works w/ panels too
gen work = (y + L.y)/2
* now -merge- back and -tsset- as usual
keep panelid year work
merge 1:1 panelid year using mymaster
tsset panelid year
* fill in gaps; it's just carry forward
replace work = L.work if missing(work)
* finishing tape in sight; mine's a red wine
gen newaverage = L.work
drop _merge work
list
Nick
[email protected]
On 24 March 2014 13:15, Irene Ferrari <[email protected]> wrote:
> Dear Statalisters,
>
> I am a new statalister and I would greatly appreciate if you could
> give me advice on something which seems trivial to explain, less
> trivial to translate in Stata code.
>
> I have a panel dataset, and I would like to perform the average
> (variable "average") of the last two previous in time observations of
> "y" whose corresponding dummy variable ("dummy") is equal to 1.
>
> Example:
>
> --------------------------------------------------------------
> id year dummy y average
> 1 1990 0 10 .
> 2 1991 0 11 .
> 3 1992 1 12 .
> 4 1993 1 13 .
> 5 1994 0 14 12.5
> 6 1995 0 15 12.5
> 7 1996 1 16 12.5
> 8 1997 1 17 14.5
> 9 1998 1 18 16.5
> 10 1999 0 19 17.5
> --------------------------------------------------------------
>
> So, for example, when computing the average for year 1996, I need the
> programme to search back the two previous y such that the
> corresponding dummy is equal to 1, which are y==12 and y==13, and then
> compute the average.
>
> Here is the latest code (of many) I was trying, with no success (it
> seems stuck in an infinite loop...):
>
> replace average = .
> local k=1
> local r=0
> local N = _N
> local z=1
> forvalues i = 1/`N' {
> while `k'<=2 {
> if dummy[`i'-`z'] == 1 {
> local r=
> `r'+ y[`i'-`z']
> local k=`k'+1
> local z=`z'+1
> }
> else {
> local z=`z'+1
> }
> }
> replace average=(`r'/2) in `i'
> local k=1
> local r=0
> local z=1
> }
>
>
> I have never needed to perform complicated programming in Stata, so it
> is totally possible I am missing something obvious.
> Thanks for your consideration.
>
>
> Irene Ferrari
> PhD Student - Department of Economics
> University of Bologna
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/