Hello Roger,
could you please summarize what the warning was about? In particular,
does it relate to the "_prefix" commands or to "_"-prefixed commands
(where "_prefix_expand" would be an example of the former and
"_regress" an example of the latter)?
Though the help for the "_prefix" commands seems interesting, I find
it more exciting to learn about commands that are not only
undocumented but not mentioned anywhere, not even on the internet
(Google currently returns 0 links). Does anyone have an idea of how
the "_xt..." commands work? I mean these:
_xtarm
_xtmka
_xtmkz
_xtzw
_xtwhw
_xta2
Does anyone have a complete list of _all Stata commands and is willing
to present it to the community?
Thank you,
Sergiy
On 9/16/07, Newson, Roger B <[email protected]> wrote:
> Thanks to David Elliott, Mike Blasnik and David Airey for their very
> helpful and detailed replies to my query. These shall be used to inform
> the first Stata 10 update to -parmby-, when I have Stata 10.
>
> And thanks also to Vince Wiggins, who warned me (during the 13th UK
> Stata User Meeting last week) of the dangers of ordinary users trying to
> get too deep into the undocumented _prefix suite of commands, used
> internally by StataCorp for -statsby- and other prefixes. (In Stata,
> type
>
> whelp _prefix
>
> to find out more about these.)
>
> Best wishes
>
> Roger
>
>
> Roger Newson
> Lecturer in Medical Statistics
> Respiratory Epidemiology and Public Health Group
> National Heart and Lung Institute
> Imperial College London
> Royal Brompton campus
> Room 33, Emmanuel Kaye Building
> 1B Manresa Road
> London SW3 6LR
> UNITED KINGDOM
> Tel: +44 (0)20 7352 8121 ext 3381
> Fax: +44 (0)20 7351 8322
> Email: [email protected]
> Web page: www.imperial.ac.uk/nhli/r.newson/
> Departmental Web page:
> http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
>
> Opinions expressed are those of the author, not of the institution.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of David Elliott
> Sent: 14 September 2007 15:07
> To: [email protected]
> Subject: Re: st: Does Blasnik's Law apply to -use-?
>
> Being Stata users, we should approach this in a rigorous scientific
> fashion:
>
> X-----begin-----X
>
> program define intest
>     version 9.0
>
>     *! version 1.0.0 2007.09.13
>     *! Simulate using part of file with in #/##
>     *! by David C. Elliott
>     *!
>     *! using      name of trial dataset
>     *! postname   specifies filename of postfile
>     *! numblocks  is number of file blocks to create
>
>     syntax using/ , POSTname(string) NUMblocks(int)
>
>     * Remember the user's -more- setting so it can be restored at the end
>     local more `c(more)'
>     set more off
>
>     use `using', clear  // Load first to eliminate any first-pass caching effects
>     local recblock = round(`c(N)'/`numblocks',1)
>
>     tempname post
>     postfile `post' double block float timein timeif using `postname', ///
>         every(10) replace
>
>     timer clear 1
>     n di _n(2) "{txt}{col 11}{center 10:-- IF --}{center 10:-- IN --}" _n ///
>         "{center 10:Block}{center 10:Time}{center 10:Time}" _n ///
>         "{hline 30}"
>     local lastblock = `c(N)' - `recblock'
>     forvalues i=1(`recblock')`lastblock' {
>         local block = `i'
>         * Time loading the same block of observations with -if-, then with -in-
>         foreach I in if in {
>             if "`I'" == "in" {
>                 local ifin in `i'/`=`i'+`recblock''
>             }
>             else {
>                 local ifin if inrange(_n, `i', `=`i'+`recblock'')
>             }
>             timer on 1
>             use `using' `ifin', clear
>             timer off 1
>             qui timer list 1
>             local time`I' : display %5.2f round(`r(t1)',.01)
>             timer clear 1
>         }
>         post `post' (`block') (`timein') (`timeif')
>         n di "{res}{ralign 10:`block'}{ralign 10:`timeif'}{ralign 10:`timein'}"
>     }
>     postclose `post'
>     set more `more'
>     use `postname', clear
>     lab var block "Record Block"
>     lab var timein "Load Time using IN"
>     lab var timeif "Load Time using IF"
>     tw line timein block || line timeif block
> end
>
> X-----end-----X
>
> For example:
>
> . intest using dss_data_06_07.dta , postname(intest.dta) numblocks(100)
>
>
>            -- IN --  -- IF --
>   Block      Time      Time
> ------------------------------
>          1      0.64      0.88
>      17278      0.47      0.77
>      34555      0.47      0.77
>      51832      0.47      0.78
>      69109      0.45      0.78
>      86386      0.45      0.78
>     103663      0.47      0.78
>     120940      0.47      0.77
> ...
>
> This ado-file runs an -if- versus -in- simulation and graphs the
> results. From my findings I can confirm a speed advantage of about
> 50% using -in- on a dataset with obs: 1,727,673, vars: 28,
> size: 266,061,642.
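
A minimal sketch of how the posted timings could be summarized numerically as well as graphed. It assumes the postfile was saved as intest.dta, as in the example call above; the variable name ratio is hypothetical and not part of the original program.

```stata
* Sketch only: assumes intest.dta was created by the example call to -intest-
use intest.dta, clear

* Ratio of -in- load time to -if- load time for each block
* (values below 1 mean -in- was faster for that block)
generate double ratio = timein / timeif

summarize timein timeif ratio
```
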
>
> However, things get murkier. Run a simulation, then max out Stata's
> memory setting with as much memory as the system will give you and run
> the simulation again. When you do this, you eliminate the system's
> ability to cache the file. Ordinarily, subject to filesize and
> available memory, Stata may be reading the file from cache. If this
> is the case, one will see an advantage to using -in-. However, if the
> caching advantage is eliminated by increasing Stata's memory, my
> simulations show that the time saved by using -in- disappears. I also
> tested this on large network databases and was unable to demonstrate
> any advantage to -in-.
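
The two-pass comparison described above might be run along these lines. This is a sketch for Stata releases that still use -set memory- (the program itself declares version 9.0); the value 1g and the file name intest_maxmem.dta are placeholders, not values from the original post.

```stata
* Pass 1: default memory setting, so the operating system can cache the file
intest using dss_data_06_07.dta, postname(intest.dta) numblocks(100)

* Pass 2: raise Stata's memory allocation to leave little room for file caching
* (1g is a placeholder; the usable ceiling depends on the machine)
clear
set memory 1g
intest using dss_data_06_07.dta, postname(intest_maxmem.dta) numblocks(100)
```

Comparing the two postfiles then shows whether the gap between -in- and -if- narrows once caching is no longer available.
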
>
> So back to Roger's initial question. It would appear that for
> cacheable file sizes and large numbers of by-groups a strategy using
> -in- might be feasible. There is an overhead penalty for setting up
> the by-groups so that they are selectable using -in-, since this
> involves sorts and the like. For a small number of by-groups the
> speed advantage might be lost, but for many levels and a large
> number of iterations there would be an advantage.
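
One way the -in- strategy for by-groups could be set up is sketched below. It is only an illustration under stated assumptions: the dataset bigdata.dta, the by-variable group, and the names bigdata_sorted, obsno, first and last are all hypothetical. The sort, save and -collapse- steps are the setup overhead mentioned above.

```stata
* Sketch only: bigdata.dta and the variable -group- are hypothetical
use bigdata, clear
sort group
generate long obsno = _n
save bigdata_sorted, replace

* One record per by-group, holding its first and last observation number
collapse (min) first=obsno (max) last=obsno, by(group)
local G = _N
forvalues g = 1/`G' {
    local f`g' = first[`g']
    local l`g' = last[`g']
}

* Load each by-group by observation range instead of by an -if- condition
forvalues g = 1/`G' {
    use bigdata_sorted in `f`g''/`l`g'', clear
    * ... per-group analysis goes here ...
}
```
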
>
> DC Elliott
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>