Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Average of last two previous observations conditional on another variable value
From
Nick Cox <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: Average of last two previous observations conditional on another variable value
Date
Mon, 24 Mar 2014 16:14:24 +0000
The code I gave can be simplified.
bysort panelid (year) : gen time = _n
tsset panelid time
gen work = (y + L.y)/2
can be reduced to
bysort panelid (year) : gen work = (y + y[_n-1])/2
As Pascal said, I would have written a shorter book given more time.
Nick
[email protected]
On 24 March 2014 14:06, Nick Cox <[email protected]> wrote:
> I struggled with this mightily until I got a better way to think about it.
>
> What's clear is that a loop over observations is the solution of last
> resort, although long-term readers of this list should be able to
> recall that that doesn't rule out such a solution being offered....
>
> The trick is that the two previous values you want are easy to
> identify *once you have a dataset that contains only positive values
> for -dummy-*.
>
> In your example -id- is redundant given -year- and is manifestly not a
> panel identifier. So, I faked a panel identifier for a single panel.
> The code should work with an equivalent real panel identifier.
>
> . clear
>
> . input id year dummy y average
>
> id year dummy y average
> 1. 1 1990 0 10 .
> 2. 2 1991 0 11 .
> 3. 3 1992 1 12 .
> 4. 4 1993 1 13 .
> 5. 5 1994 0 14 12.5
> 6. 6 1995 0 15 12.5
> 7. 7 1996 1 16 12.5
> 8. 8 1997 1 17 14.5
> 9. 9 1998 1 18 16.5
> 10. 10 1999 0 19 17.5
> 11. end
>
> . gen panelid = 42
>
> . save mymaster
> file mymaster.dta saved
>
> . keep if dummy
> (5 observations deleted)
>
> . bysort panelid (year) : gen time = _n
>
> . tsset panelid time
> panel variable: panelid (strongly balanced)
> time variable: time, 1 to 5
> delta: 1 unit
>
> . gen work = (y + L.y)/2
> (1 missing value generated)
>
> . keep panelid year work
>
> . merge 1:1 panelid year using mymaster
>
> Result # of obs.
> -----------------------------------------
> not matched 5
> from master 0 (_merge==1)
> from using 5 (_merge==2)
>
> matched 5 (_merge==3)
> -----------------------------------------
>
> . tsset panelid year
> panel variable: panelid (strongly balanced)
> time variable: year, 1990 to 1999
> delta: 1 unit
>
> . replace work = L.work if missing(work)
> (3 real changes made)
>
> . gen newaverage = L.work
> (4 missing values generated)
>
> . drop _merge work
>
> . list
>
> +-------------------------------------------------------+
> | year panelid id dummy y average newave~e |
> |-------------------------------------------------------|
> 1. | 1990 42 1 0 10 . . |
> 2. | 1991 42 2 0 11 . . |
> 3. | 1992 42 3 1 12 . . |
> 4. | 1993 42 4 1 13 . . |
> 5. | 1994 42 5 0 14 12.5 12.5 |
> |-------------------------------------------------------|
> 6. | 1995 42 6 0 15 12.5 12.5 |
> 7. | 1996 42 7 1 16 12.5 12.5 |
> 8. | 1997 42 8 1 17 14.5 14.5 |
> 9. | 1998 42 9 1 18 16.5 16.5 |
> 10. | 1999 42 10 0 19 17.5 17.5 |
> +-------------------------------------------------------+
>
> For convenience here is the code as a block, with commentary.
>
> * set up example
> clear
> input id year dummy y average
> 1 1990 0 10 .
> 2 1991 0 11 .
> 3 1992 1 12 .
> 4 1993 1 13 .
> 5 1994 0 14 12.5
> 6 1995 0 15 12.5
> 7 1996 1 16 12.5
> 8 1997 1 17 14.5
> 9 1998 1 18 16.5
> 10 1999 0 19 17.5
> end
>
> * we need a panel identifier, here just for a single panel
> gen panelid = 42
> save mymaster
>
> * reduce the dataset to just -dummy==1- and -tsset- with pseudotime
> keep if dummy
> bysort panelid (year) : gen time = _n
> tsset panelid time
>
> * we want the average of this y and the previous y: works w/ panels too
> gen work = (y + L.y)/2
>
> * now -merge- back and -tsset- as usual
> keep panelid year work
> merge 1:1 panelid year using mymaster
> tsset panelid year
>
> * fill in gaps; it's just carry forward
> replace work = L.work if missing(work)
>
> * finishing tape in sight; mine's a red wine
> gen newaverage = L.work
> drop _merge work
> list
>
> Nick
> [email protected]
>
>
> On 24 March 2014 13:15, Irene Ferrari <[email protected]> wrote:
>> Dear Statalisters,
>>
>> I am a new statalister and I would greatly appreciate if you could
>> give me advice on something which seems trivial to explain, less
>> trivial to translate in Stata code.
>>
>> I have a panel dataset, and I would like to perform the average
>> (variable "average") of the last two previous in time observations of
>> "y" whose corresponding dummy variable ("dummy") is equal to 1.
>>
>> Example:
>>
>> --------------------------------------------------------------
>> id year dummy y average
>> 1 1990 0 10 .
>> 2 1991 0 11 .
>> 3 1992 1 12 .
>> 4 1993 1 13 .
>> 5 1994 0 14 12.5
>> 6 1995 0 15 12.5
>> 7 1996 1 16 12.5
>> 8 1997 1 17 14.5
>> 9 1998 1 18 16.5
>> 10 1999 0 19 17.5
>> --------------------------------------------------------------
>>
>> So, for example, when computing the average for year 1996, I need the
>> programme to search back the two previous y such that the
>> corresponding dummy is equal to 1, which are y==12 and y==13, and then
>> compute the average.
>>
>> Here is the latest code (of many) I was trying, with no success (it
>> seems stuck in an infinite loop...):
>>
>> replace average = .
>> local k=1
>> local r=0
>> local N = _N
>> local z=1
>> forvalues i = 1/`N' {
>> while `k'<=2 {
>> if dummy[`i'-`z'] == 1 {
>> local r=
>> `r'+ y[`i'-`z']
>> local k=`k'+1
>> local z=`z'+1
>> }
>> else {
>> local z=`z'+1
>> }
>> }
>> replace average=(`r'/2) in `i'
>> local k=1
>> local r=0
>> local z=1
>> }
>>
>>
>> I have never needed to perform complicated programming in Stata, so it
>> is totally possible I am missing something obvious.
>> Thanks for your consideration.
>>
>>
>> Irene Ferrari
>> PhD Student - Department of Economics
>> University of Bologna
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/