Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Looping Enquiry
From
George Bouliotis <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Looping Enquiry
Date
Thu, 29 Sep 2011 17:49:39 +0100
Dear Robert, Nick, Valerie and Richard
Thank you for your responses to my question and for your helpful comments. Yes, my problem is programming the running-product calculation.
I have already tried different approaches and ideas (e.g. egen with a product function, or sums), but nothing worked correctly. The "lagged" approach seems promising, but the tricky point is when the (sub)products should be 1.
To make things clearer, here is a more detailed dataset:
time group indic RESULT step1 step2 step3 step4
30 1 0 -.958 47 48 .9791667 1-2*(47/48) =-0.958
57 0 0 -.916 46 47 .9787234 1-2*((46/47)*(47/48)) =-0.916
58 0 0 -.874 45 46 .9782609 1-2*((45/46)*(46/47)*(47/48)) =-0.874
67 0 0 -.834 44 45 .9777778 1-2*((44/45)*(45/46)*(46/47)*(47/48)) =-0.834
74 0 0 -.792 43 44 .9772727 1-2*((43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =-0.792
79 0 0 -.75 42 43 .9767442 1-2*((42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =-0.750
79 1 1 .125 42 43 .9767442 1-1*((42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =0.125
82 1 1 .125 42 43 .9767442 1-1*((42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =0.125
89 0 0 -.708 41 42 .9761904 1-2*((41/42)*(42/43)*(43/44)*(44/45)*(45/46)*(46/47)*(47/48)) =-0.708
95 1 0 -.662 40 41 .9756098 and so on...
98 0 0 -.678 39 40 .975
101 0 0 -.574 38 39 .974359
104 0 0 -.532 37 38 .9736842
110 0 0 -.448 36 37 .972973
118 0 0 -.444 35 36 .9722222
The task is to correctly generate the "step4" variable, which depends on the value of the dummy variable "indic". If indic is 0, then step4 is calculated as 1-2*(products), and if indic is 1, then step4 is calculated as 1-1*(products). The variable (column) RESULT simply illustrates what the correct values of "step4" should be.
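The step4 expressions written out in the table telescope, which gives a quick hand check on any programmed result. As a sketch only: write n_t for step1 in row t and m_t for the multiplier; both symbols are introduced here for illustration and are not part of the dataset. The check covers the rows whose step4 is written out in full, not the rows marked "and so on".

```latex
\[
\text{step4}_t
  \;=\; 1 - m_t \prod_{k=n_t}^{47} \frac{k}{k+1}
  \;=\; 1 - m_t\,\frac{n_t}{48},
\qquad
m_t =
\begin{cases}
  2 & \text{if } \text{indic}_t = 0,\\
  1 & \text{if } \text{indic}_t = 1.
\end{cases}
\]
% time 74 (indic = 0, step1 = 43): 1 - 2(43/48) = -0.7917
% time 79 (indic = 1, step1 = 42): 1 - 1(42/48) =  0.125
```

Rows with indic equal to 1 reuse the step1 of the preceding row, so the product picks up no new factor there and only the multiplier changes.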
Thank you very much
George
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Robert Picard
Sent: 29 September 2011 17:40
To: [email protected]
Subject: Re: st: Looping Enquiry
It looks like George is stuck on calculating a running product,
similar to the sum() function. This can easily be done without a loop:
gen rp = _n
replace rp = rp * rp[_n-1] if _n > 1
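
To see what these two lines do, here is a minimal sketch on a made-up five-row dataset (the variables x, rp and rp_log are illustrative and not from the thread). It relies on -replace- working through the observations in order, so that rp[_n-1] already holds the updated product when row _n is processed. The exp(sum(ln())) line is an alternative idiom added here as an assumption about what may be convenient; it is valid only for strictly positive values.

```stata
clear
set obs 5
gen double x = _n

* running product via in-place -replace-:
* observations are updated in order, so rp[_n-1] is
* already the product of all earlier rows
gen double rp = x
replace rp = rp * rp[_n-1] if _n > 1

* alternative for strictly positive x:
* running sum of logs, then exponentiate
gen double rp_log = exp(sum(ln(x)))

list, clean noobs
```

With x = 1, ..., 5 the rp column holds the factorials 1, 2, 6, 24, 120, and rp_log agrees with it up to floating-point rounding.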
Since George is trying to replicate the RESULT variable and he's not
using the time and group variables, I suspect that Richard's solution
is not what George is looking for. While I can't quite figure out
everything, here's an attempt that gets close:
*----------- begin example -------------
clear
input time group var RESULT
30 1 0 -0.958
57 0 0 -0.916
58 0 0 -0.834
67 0 0 -0.874
74 0 0 -0.792
79 0 0 -0.750
79 1 1 0.125
82 1 1 0.125
89 0 0 -0.706
95 1 0 -0.662
98 0 0 -0.678
101 0 0 -0.574
104 0 0 -0.532
110 0 0 -0.448
118 0 0 -0.444
end
* setup, per OP
egen ssize=seq() if var==0,from(47) to(1)
replace ssize=ssize[_n-1] if ssize==.
gen product= (ssize/(ssize+1))
* running product that does not change when var == 1
gen double rprod = cond(var,1,product)
replace rprod = rprod * rprod[_n-1] if _n > 1
* replicate RESULT
gen score = 1 - cond(var,1,2) * rprod
format %9.3g score
list, clean noobs
*------------ end example --------------
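
The example above stores the running product with -gen double-. As a rough illustration of why that can matter, the sketch below stores the same ratio once as a float and once as a double (the variable names xf and xd are made up here). A float keeps only about 7 significant digits, so the two stored values differ, and such rounding errors can build up as more factors are multiplied.

```stata
clear
set obs 1

* same ratio, stored at two precisions
gen float  xf = 47/48
gen double xd = 47/48

* show both values with many decimals, then their difference
display %20.18f xf[1]
display %20.18f xd[1]
display xd[1] - xf[1]

* rounding the double to float precision recovers the float value
display float(xd[1]) == xf[1]
```

For the handful of factors in this dataset the difference is far below the three decimals shown in RESULT, so this is a precaution rather than a fix.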
On Thu, Sep 29, 2011 at 10:24 AM, Nick Cox <[email protected]> wrote:
> This looks a neat solution, assuming that Richard has correctly understood the question.
>
> I'd add a thought that no harm would be done by putting the result of -generate- into a -double-. These numbers don't look problematic, but a little worry about loss of precision would do no harm.
>
> Nick
> [email protected]
>
> Richard Herron
>
> If I understand the question, I think you can do this without a loop.
> If you sort on group and time, then you can create a sequential time
> index, use -tsset-, and use lag operators to generate your product.
> Here's my attempt, please let me know if I got your question wrong.
>
> * begin code
> clear
> input time group var RESULT
> 30 1 0 -0.958
> 57 0 0 -0.916
> 58 0 0 -0.834
> 67 0 0 -0.874
> 74 0 0 -0.792
> 79 0 0 -0.750
> 79 1 1 0.125
> 82 1 1 0.125
> 89 0 0 -0.706
> 95 1 0 -0.662
> 98 0 0 -0.678
> 101 0 0 -0.574
> 104 0 0 -0.532
> 110 0 0 -0.448
> 118 0 0 -0.444
> end
>
> bysort group (time): generate time_seq = _n
> tsset group time_seq
> by group: generate observ = RESULT * l.RESULT * l2.RESULT * l3.RESULT
> * end code
>
> which produces
>
> . list, clean
>
> time group var RESULT time_seq observ
> 1. 57 0 0 -.916 1 .
> 2. 58 0 0 -.834 2 .
> 3. 67 0 0 -.874 3 .
> 4. 74 0 0 -.792 4 .5288082
> 5. 79 0 0 -.75 5 .4329761
> 6. 89 0 0 -.706 6 .3665241
> 7. 98 0 0 -.678 7 .2843288
> 8. 101 0 0 -.574 8 .2060666
> 9. 104 0 0 -.532 9 .1461699
> 10. 110 0 0 -.448 10 .0927537
> 11. 118 0 0 -.444 11 .0607414
> 12. 30 1 0 -.958 1 .
> 13. 79 1 1 .125 2 .
> 14. 82 1 1 .125 3 .
> 15. 95 1 0 -.662 4 .0099093
>
> .
>
> On Thu, Sep 29, 2011 at 09:22, George Bouliotis <[email protected]> wrote:
>
>> Although I am a long-time Stata user, I am only now taking my first steps in programming.
>>
>> One part of my programme tries (unsuccessfully) to replicate the column RESULT below. The difficulty is how to loop over a sequential product such as, for instance: observ4 = obs4 X obs3 (lag1) X obs2 (lag2) X obs1 (lag3).
>>
>> I tried some loops with "forvalue" but none was successful. I would appreciate any help with this.
>
> [...]
>
>>
>> #####################################
>> set more off
>> clear
>> input time group var RESULT
>> 30 1 0 -0.958
>> 57 0 0 -0.916
>> 58 0 0 -0.834
>> 67 0 0 -0.874
>> 74 0 0 -0.792
>> 79 0 0 -0.750
>> 79 1 1 0.125
>> 82 1 1 0.125
>> 89 0 0 -0.706
>> 95 1 0 -0.662
>> 98 0 0 -0.678
>> 101 0 0 -0.574
>> 104 0 0 -0.532
>> 110 0 0 -0.448
>> 118 0 0 -0.444
>> end
>>
>>
>> list , clean
>>
>> //Generating Ssize variable
>> egen ssize=seq() if var==0,from(47) to(1)
>> replace ssize=ssize[_n-1] if ssize==.
>> list, noobs clean
>>
>>
>> //Generating product variable
>> gen product= (ssize/(ssize+1))
>>
>>
>> //Generating Score variable (PRODUCT)
>> gen score= 1-(2*product) in 1/1 if var==0
>> // for the first observation only
>>
>> //**REPLACEMENT A: when var==0
>> replace score= 1-(2*(product*product[_n-1])) if var==0 & score==.
>> // fine for the second obs only (correct formula for when var=0)
>>
>> //**REPLACEMENT B: when var==1
>> replace i1= 1-1*(score*score[_n-1]) if var==1
>> // fine for the second observ only (correct formula for when var=1)
>> // but instead of [_n-1] I need a loop for [_n-`n(lagged)'] with "forvalue" command?
>>
>> list, clean noobs
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>