Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Regression by industry and year excluding firm i
From
"Sarah Edgington" <[email protected]>
To
<[email protected]>
Subject
RE: st: Regression by industry and year excluding firm i
Date
Fri, 13 Dec 2013 11:41:13 -0800
Ahmed,
As an aside, this strikes me as one of those instances where you would
benefit a great deal from debugging your code on a subset of your data. You
need enough data for your regressions to run without errors but I'd try
getting the loop working on a subset of a few hundred observations rather
than the whole data set. That will run much more quickly. The resulting
predictions will be nonsense but they'll serve as a proof of concept. Once
you're happy that you have code that does what you expect you can run it on
the whole dataset with a certain amount of confidence that even if it takes
a very long time, you'll get the results that reflect your intended process.
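
A minimal sketch of one way to set up such a test run, assuming simulated data and a cut-down one-regressor model in place of the real one (the variable cell and the simulated values are illustrative, not from the thread). It keeps a few whole industry-year groups rather than an arbitrary slice, so each trial regression still has enough observations to run:

```stata
* Toy data: variable names follow the thread, values are simulated
clear
set seed 12345
set obs 2000
gen sic_2 = ceil(20 * runiform())
gen datadate = 2000 + ceil(5 * runiform())
gen wato = rnormal()
gen wce = 0.5 * wato + rnormal()

* Work on a few industry-year cells, leaving the full data untouched
preserve
egen cell = group(sic_2 datadate)
keep if cell <= 3

* Trial leave-one-out loop on the subset only
gen b1 = .
forvalues i = 1/`=_N' {
    capture quietly regress wce wato if cell == cell[`i'] & _n != `i'
    * store the slope only when this regression actually ran
    if _rc == 0 quietly replace b1 = _b[wato] in `i'
}
summarize b1
restore
```

Because of -preserve- and -restore-, the full dataset comes back unchanged once the trial run finishes.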
-Sarah
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Fernando Rios
Avila
Sent: Friday, December 13, 2013 11:21 AM
To: [email protected]
Subject: Re: st: Regression by industry and year excluding firm i
Ahmed,
In addition to Nick Cox's comments, keep in mind that, based on your
explanation, you need to run 95000 regressions, which will be very time
consuming. But computer time is "cheap".
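
A rough way to put a number on "very time consuming" is to time a few iterations and scale up. The sketch below is only illustrative: it uses simulated data and a one-regressor model (x, y and the trial size of 20 are assumptions, not from the thread), and it assumes run time grows roughly in proportion to the number of regressions:

```stata
* Toy data standing in for the real dataset
clear
set seed 2013
set obs 500
gen x = rnormal()
gen y = 1 + 2*x + rnormal()

* Time the first 20 leave-one-out regressions
local trial 20
timer clear 1
timer on 1
forvalues i = 1/`trial' {
    quietly regress y x if _n != `i'
}
timer off 1
quietly timer list 1

* Extrapolate linearly to one regression per observation
local est_seconds = r(t1) / `trial' * _N
display "Estimated total run time: " %9.1f `est_seconds' " seconds"
```

On the real data the per-regression cost will differ, so the estimate is a guide to the order of magnitude only.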
I would suggest, however, clarifying whether each observation represents a
different firm, which is the assumption behind how your code and Nick's handle
the problem.
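
One possible check, sketched on a tiny made-up dataset: firm_id is a hypothetical firm identifier (the thread never names one), and n_firm_obs is an illustrative helper variable.

```stata
* Toy panel: firm 1 appears twice in the same industry-year
clear
input firm_id sic_2 datadate
1 20 2010
1 20 2010
2 20 2010
3 20 2010
4 35 2010
end

* Number of observations each firm contributes to its industry-year cell
bysort sic_2 datadate firm_id: gen n_firm_obs = _N

* A count of zero means one observation per firm per cell, so dropping
* observation i is the same as dropping firm i
count if n_firm_obs > 1
```

If the count is not zero, excluding the current observation with _n != `i' would leave other observations of the same firm in the estimation sample. Under that assumption, the condition would presumably need to compare identifiers instead, along the lines of firm_id != firm_id[`i'].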
Fernando
HTH
On Fri, Dec 13, 2013 at 2:12 PM, Nick Cox <[email protected]> wrote:
> Sorry, no.
>
> The code hasn't finished running, so
>
> 1. Good news. No obvious bug.
>
> 2. I'd expect that code to be slow. You want a regression for every
> observation.
>
> I don't think you've demonstrated anything wrong with my code, so I
> can't possibly fix it. That doesn't mean the code must be right, but
> you need to show me incorrect results first. The point is that your
> code would, I imagine, have been even slower had it been correct.
> Several of the changes I made would have speeded up things compared
> with your code.
>
> I don't have your data to test anything, but without wanting to seem
> arrogant, I think you need to be confident that I made a mistake
> before you change my code.
>
> Nick
> [email protected]
>
>
> On 13 December 2013 19:01, Abdalla, Ahmed <[email protected]> wrote:
>> Dear Nick
>> Many Thanks for that.
>> I understand your code now. I ran it. However, STATA has been running the
>> loop for more than 40 minutes now and I got no output !!!
>> I will explain more:
>> I have a model:
>> wce = b0 + b1*wlag_ce + b2*wato + b3*wlag_acc + b4*wacc + b5*wdsale + b6*wndsale
>>
>> I want to run this model using all observations in a particular
>> industry-year excluding firm i. Expected wce for firm i are measured using
>> the coefficients I obtain from the industry year regressions multiplied by
>> the actual values of the variables in the model for firm i.
>> As far as I understand your code should achieve my target, but it took
>> long time and didn't give any results !
>> I even tried another code that worked well and give me results in
>> seconds, but it doesn't exclude firm i from the estimation. I will write
>> this code for you here:
>> egen sic2id=group(sic_2 datadate)
>> egen count=count(sic2id), by(sic2id)
>> drop if count<10
>> drop count
>> drop sic2id
>> egen sic2id=group(sic_2 datadate)
>>
>> gen b0=.
>> gen b1= .
>> gen b2=.
>> gen b3=.
>> gen b4=.
>> gen b5=.
>> gen b6=.
>>
>> sum sic2id
>> scalar max2=r(max)
>> local k=max2
>> set more off
>> forvalues x=1(1)`k'{
>> capture reg wce wlag_ce wato wlag_acc wacc wdsale wndsale if sic2id==`x'
>> capture replace b0= _b[_cons]
>> capture replace b1= _b[wlag_ce]
>> capture replace b2= _b[wato]
>> capture replace b3= _b[wlag_acc]
>> capture replace b4= _b[wacc]
>> capture replace b5= _b[wdsale]
>> capture replace b6= _b[wndsale]
>> }
>>
>> I appreciate if you can explain what was wrong with your code and update
>> the new code I have posted here to exclude firm i.
>>
>>
>>
>>
>> ________________________________________
>> From: [email protected]
>> <[email protected]> on behalf of Nick Cox
>> <[email protected]>
>> Sent: 13 December 2013 18:03
>> To: [email protected]
>> Subject: Re: st: Regression by industry and year excluding firm i
>>
>> Remarks
>>
>> 1. If you are cycling over observations, you don't need a variable
>> containing observation numbers, nor to use -levelsof-.
>>
>> 2. -in- is always faster than the corresponding -if-.
>>
>> 3. wlag_ce=!=. is presumably a typo, but to Stata it will be illegal
>> syntax.
>>
>> 4. -capture replace b0= _b[_cons]- will end with the last intercept
>> calculated. I guess you don't want that.
>>
>> 5. Checking for missing values is redundant as -regress- will never
>> include them.
>>
>> With these and some other small tricks, here is an attempt at
>> rewriting your code.
>>
>> local X wlag_ce wato wlag_acc wacc wdsale wndsale
>> tokenize "`X'"
>>
>> forval j = 0/6 {
>>     gen b`j'=.
>> }
>>
>> forval i = 1/`=_N' {
>>     local same sic_2[`i'] == sic_2 & datadate[`i'] == datadate
>>     qui count if `same' & _n != `i'
>>
>>     if r(N) > 10 {
>>         reg wce `X' if `same' & _n != `i'
>>     }
>>
>>     quietly if _rc == 0 {
>>         replace b0 = _b[_cons] in `i'
>>         forval j = 1/6 {
>>             replace b`j' = _b[``j''] in `i'
>>         }
>>     }
>> }
>>
>> gen pred_ce= b0 + b1*wlag_ce + b2*wato + b3*wlag_acc + ///
>>     b4*wacc + b5*wdsale + b6*wndsale
>>
>> Nick
>> [email protected]
>>
>>
>> On 13 December 2013 17:33, Abdalla, Ahmed <[email protected]> wrote:
>>> Dear Statalist
>>> I run a regression to estimate core earnings for each variable in my
>>> dataset. The regression is run using all observations in a particular
>>> industry year EXCLUDING firm i. Expected core earnings for firm i is
>>> estimated using the coefficients multiplied by the actual values of
>>> variables in the model for firm i.
>>> I run the following code.
>>>
>>> First: I get an error message for macro length being exceeded.
>>> Second: I try to use other commands for looping, the loop runs but it
>>> gives me error message for invalid syntax.
>>> My problem is on how to exclude firm i ? I hope if you have any
>>> suggestions regarding running regressions by industry and year and
>>> excluding firm i from the estimation procedures.
>>>
>>>
>>> gen obs= [_n]
>>> gen runn=1
>>>
>>> gen b0=.
>>> gen b1= .
>>> gen b2=.
>>> gen b3=.
>>> gen b4=.
>>> gen b5=.
>>> gen b6=.
>>>
>>> levelsof obs,local(levels)
>>> foreach x of local levels{
>>> gen mark=1 if obs==runn
>>> gen sic_lp= sic_2 if obs ==runn
>>> qui summ sic_lp
>>> replace sic_lp = r(mean) if sic_lp==.
>>> gen datadate_lp= datadate if obs == runn
>>> qui summ datadate_lp
>>> replace datadate_lp = r(mean) if datadate_lp==.
>>> format datadate_lp %d
>>> gen sample =1 if sic_lp== sic_2 & datadate_lp== datadate & sale !=. & wce !=. & wlag_ce=!=. & wato !=. & wacc !=. & wlag_acc!=. & wdsale !=. & wndsale !=.
>>> egen sample_sum= sum(sample) if mark != 1
>>> capture reg wce wlag_ce wato wlag_acc wacc wdsale wndsale if sample==1 & mark != 1 & sample_sum >10
>>> capture replace b0= _b[_cons]
>>> capture replace b1= _b[wlag_ce] if obs==runn
>>> capture replace b2= _b[wato] if obs==runn
>>> capture replace b3= _b[wlag_acc] if obs==runn
>>> capture replace b4= _b[wacc] if obs==runn
>>> capture replace b5= _b[wdsale] if obs==runn
>>> capture replace b6= _b[wndsale] if obs==runn
>>> drop mark sic_lp datadate_lp sample sample_sum
>>> replace runn= runn+1
>>> }
>>>
>>> gen pred_ce= b0+ b1*wlag_ce + b2*wato +b3*wlag_acc + b4*wacc +
>>> b5*wdsale + b6*wndsale
>>>
>>>
>>> I appreciate your help
>>>
>>>
>>>
>>>
>>>
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/