Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: xt: unit-specific trends
From: László Sándor <[email protected]>
To: [email protected]
Subject: Re: st: xt: unit-specific trends
Date: Thu, 19 Apr 2012 10:11:30 -0400
Thanks, Nick, Austin,
quickly on this: before I rewrite my code again, I would appreciate a
comment from StataCorp. I used "if `touse'" because that is the
official way to make a program byable
(http://www.stata.com/help.cgi?byable). If there is any case where the
-if- condition need not be checked for the entire dataset, a -by:- run
is one, isn't it? (By the way, since -by:- always runs on data sorted
on the "by" variable, it lends itself to range subscripts.) So I would
expect Stata to take care of that when it runs a byable command with
-bys:-.
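For concreteness, the pattern I mean looks roughly like the sketch
below. This is a minimal illustration, not the -genbump- code from this
thread: the program name -mymean- is made up, and the shipped auto
dataset stands in for the real panel.

```stata
* Hypothetical byable program; -mymean- is an illustrative name only.
capture program drop mymean
program define mymean, byable(recall) rclass
    syntax varname [if] [in]
    * Under -by:-, marksample marks only the current by-group.
    marksample touse
    quietly summarize `varlist' if `touse', meanonly
    display as text "mean of `varlist': " as result r(mean)
    return scalar mean = r(mean)
end

sysuse auto, clear
bysort foreign: mymean mpg
```

With byable(recall), Stata calls the program once per by-group, and
-marksample- sets `touse' to 1 only inside the current group, so every
command in the body still carries an -if `touse'- qualifier.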
I am not sure my Mata code could be much better than a well-written
-by:-, so first I would like to hear whether -bys:- is implemented
efficiently. :) And my impression is that dropping _N-10 observations
and then restoring after a -preserve- would be prohibitively costly.
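For comparison, the sort-then--in- alternative suggested in Nick's
message quoted below can be sketched as follows. This is a minimal
sketch, assuming the shipped auto dataset and a made-up marker (foreign
cars) in place of the real `touse'. Note that it reorders the data, and
that it would need a guard for the case where no observation is marked.

```stata
* Sketch: move marked observations to the top, then use -in- not -if-.
sysuse auto, clear
tempvar touse
generate byte `touse' = (foreign == 1)   // stand-in for the real marker
replace `touse' = -`touse'               // marked observations become -1
sort `touse', stable                     // ... so they sort to the top
quietly count if `touse'
local last = r(N)                        // number of marked observations

* Only observations 1 to `last' are visited here.
quietly summarize mpg in 1/`last'
local mean_in = r(mean)

* Same result with -if-, but every observation is tested.
quietly summarize mpg if `touse'
assert reldif(r(mean), `mean_in') < 1e-12
```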
Thanks,
Laszlo
On Thu, Apr 19, 2012 at 9:53 AM, Nick Cox <[email protected]> wrote:
>
> As I understand it:
>
> Working with -if- is really dumb. Stata tests every observation. Even if
> you say -if _n < 7- there is no intelligence saying "Oh, we're done
> here" once you get to observation 7. How could there be? In that
> example and some others working with -in- is a much, much better idea.
> However, you're working with -if `touse'- here. One alternative is
>
> preserve
> keep if `touse'
> ...
> restore
>
> but you still have to use -if-. It could be that
>
> replace `touse' = -`touse'   // marked observations become -1
> sort `touse'                 // ... so they sort to the top
> qui count if `touse'
> local last = r(N)            // number of marked observations
>
> and then doing things
>
> ... in 1/`last'
>
> helps.
>
> As for other stuff, I think most of the answers would be "It depends".
>
> On what MP does and doesn't do, see
> http://www.stata.com/statamp/statamp.pdf
>
> Over time, the Mata content of Stata is going up, but many of the
> basic operations are done using compiled C.
>
> I think Stata tech-support would need much more information than this
> on your data, your code and your machine before they could make many
> useful comments.
>
> More positively, you could use -set rmsg on- to see what is going most
> slowly.
>
> Nick
>
> 2012/4/19 László Sándor <[email protected]>:
> > Quick comments on this:
> >
> > I forgot to flag that the residual variable needs to exist beforehand
> > for -genbump- below; this only replaces its values.
> >
> > More importantly: The operation is still far, far from linear in the
> > number of individuals (N in the panel — T is fixed). I could again
> > finish a 1% subsample in around 10 minutes or so, but my bold attempt
> > at 10% overnight still only finished 4 out of the 8 variables to be
> > transformed this way in 10 or 11 hours.
> >
> > Maybe caching and memory are an issue here, but if anybody (StataCorp?)
> > has a comment on this otherwise, that would be helpful.
> >
> > Maybe firing up _regress and _predict all the time is very costly? Or
> > -marksample- is not fast enough under -by:-? (Does the code know that
> > once it has finished with seven consecutive rows there is no need to
> > check further down whether `touse' is 1 anywhere else? I guessed
> > byable commands produce efficient subscripting for some underlying
> > Mata code…) Or maybe the byable command does not use MP resources
> > efficiently? (Still, even if it remains serial, the run time could
> > be much closer to linear, no?)
> >
> > I thought individual-specific trends are almost as trendy nowadays as
> > fixed-effects — I wonder if they could be done much faster.
> >
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/