I remember something a while back about the -macro shift- usage not
scaling well. I don't know if that's what's going on here, but
-desmat- does use -macro shift-.
--Nick Winter
-----------------------------------------------------------
Nicholas Winter, Ph.D. P 202.939.5343
Policy Studies Associates F 202.939.5732
1718 Connecticut Avenue, NW [email protected]
Washington, DC 20009-1148 www.policystudies.com
-----------------------------------------------------------
> -----Original Message-----
> From: Roger Harbord [mailto:[email protected]]
> Sent: Thursday, October 03, 2002 11:00 AM
> To: [email protected]
> Subject: RE: st: how to make xi dummies inherit labels
>
>
> That syntax works now, thanks.
> Still seem to have this weird speed problem though. Same thing happens
> using desmat as a command. Again only the first time I run desmat on my
> dataset - even if I subsequently run it on a different variable or drop
> the _x_* variables it creates. I don't understand that but then I
> haven't tried to understand what desmat is doing internally. I guess it
> must be storing something extra somewhere.
>
> Checked that if I -keep- only a few variables the problem goes away. It
> may be the problem only occurs with the stupidly large number of
> variables (over 1000) I have in my dataset (I didn't create it myself
> and I'm reluctant to spend any time on data management to cut it down).
>
> This is in fact not the first time I've experienced strange scaling
> behaviour in the time taken by Stata to complete a command. I've been
> running some power simulations with 10000 simulations of a dataset
> containing 5-60 records, and found that if I hold the whole lot in
> memory at once and do something like:
>
> . forvalues i = 1(1)10000 {
>       regress ... if simulation==`i'
>   }
> (obviously a bit more to it than that to save the results)
>
> - things go *very* slowly - it only seemed to manage about 3
> regressions a second. Cutting the 10000 down to 1000 means the command
> completes not 10 times faster, as you might expect, but 100 times
> faster! I ended up analysing chunks of the dataset at a time and also
> using -in- instead of -if-. Now my simulations take an hour or two
> instead of a day or two.
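
One possible reading of the 10-versus-100 observation above: an -if-
qualifier is evaluated on every observation in memory each time the
command runs, so the total work in the loop grows with (number of
simulations) x (total observations), and both factors shrink when the
dataset is cut down. An -in- range only touches the observations it
names. The sketch below is an illustration under assumptions, not the
original simulation code: it assumes the data are sorted by simulation
and that every simulation has the same number of records, and nsim,
nrec, x, y and the simresults file are hypothetical names.

```stata
* Sketch only: all names here are hypothetical.
* rnormal() and // comments need a more recent Stata than version 7.
clear
set seed 12345
local nsim = 1000                  // number of simulated datasets
local nrec = 20                    // records per simulation (assumed constant)
local N = `nsim' * `nrec'
set obs `N'

* Build toy data: one block of nrec rows per simulation
generate long simulation = ceil(_n / `nrec')
generate double x = rnormal()
generate double y = 0.5 * x + rnormal()
sort simulation, stable

* Collect one row of results per simulation
tempname results
postfile `results' simulation b_x se_x using simresults, replace

forvalues i = 1/`nsim' {
    * Row range of simulation i, computed rather than searched for
    local first = (`i' - 1) * `nrec' + 1
    local last  = `i' * `nrec'
    * -in- restricts the command to these rows only;
    * "if simulation==`i'" would scan all observations each time
    quietly regress y x in `first'/`last'
    post `results' (`i') (_b[x]) (_se[x])
}
postclose `results'
```

With varying group sizes (5-60 records, as described above), the first
and last row of each simulation would have to be looked up once up front
instead of computed from a constant.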
>
> I've been meaning to post something on that for a while but I haven't
> got time to properly document the problem at the moment. Just to
> illustrate that the problem may be more general than -desmat- and could
> lie deeper in the internal workings of Stata.
>
> Maybe I really should drop all those variables I don't need and use
> -desmat-. It seems to do what I'm after (and a whole lot more..) I'm
> sure it would speed everything else up too (though other commands I'm
> using at present take a few seconds rather than a couple of minutes).
>
> Roger.
>
>
>
> --On 03 October 2002 06:37 -0700 John Hendrickx <[email protected]> wrote:
>
> > Hello once again,
> >
> > I've forgotten my own command syntax, it should be:
> >
> > desmat: logistic siweekT2 age10yy2, desrep(exp)
> >
> > There's an example on this in the help file although I suppose you do
> > have to know where to find it.
> >
> > As for the speed problems, I'm mystified. I just tried a dataset with
> > 20375 cases and 238 variables and that was no problem (although I did
> > have to increase matsize and memory). You might want to try desmat as
> > a command, see if that sheds some light on the problem:
> >
> > desmat age10yy2
> > logistic siweekT2 _x_*
> > desrep, exp
> > drop _x_*
> >
> > Of course, if you already have an alternative solution then there's
> > no need to waste any more time, but I'm curious about this speed
> > problem with desmat. Pretty strange.
> >
> > John Hendrickx
> >
> > --- Roger Harbord <[email protected]> wrote:
> >> Hi John,
> >>
> >> I've just installed the latest version of desmat available on SSC -
> >>
> >> Distribution-Date: 20011111. (I had the STB-61: dm73.3 version
> >> before.) However, an -exp- option still doesn't exist:
> >>
> >> . desmat: logistic siweekT2 age10yy2, exp
> >> exp invalid
> >> r(198);
> >>
> >> . which desmat
> >> c:\ado\stbplus\d\desmat.ado
> >> *! version 3.0, 30Mar2001, [email protected]
> >>
> >> And I'm not including any continuous covariates - only a single
> >> categorical one with 6 categories at present. -desmat- takes around 2
> >> minutes even if I give an outcome variable that doesn't exist so that
> >> all it gives is an error message to that effect. (If given a
> >> non-existent covariate it complains straight away though.)
> >>
> >> I suppose I could drop all those variables corresponding to questions
> >> that we're not using (data is results of a survey with a *long*
> >> questionnaire) but that would be some extra work to create and
> >> maintain a 'keep list' of variables I'm actually interested in.
> >>
> >> Roger.
> >>
> >>
> >> --On 03 October 2002 04:33 -0700 John Hendrickx <[email protected]> wrote:
> >>
> >> > Hi Roger,
> >> >
> >> > -desmat- should add a few seconds to your calculations but two
> >> > minutes is way too much. One explanation might be that a continuous
> >> > variable wasn't specified as such, then -desmat- will create dummies
> >> > for all 100+ categories and estimation will take a long time. Let me
> >> > know if -desmat- really slows things down that much on a large
> >> > dataset, maybe it would be worthwhile to create a lite version.
> >> >
> >> > As for exponential coefficients, use the -exp- option,
> >> >
> >> > desmat: logistic y x, exp
> >> >
> >> > will give the same results as
> >> >
> >> > xi: logistic y i.x
> >> >
> >> > -logistic- prints exponential coefficients but saves them as
> >> > loglinear values.
> >> >
> >> > Good luck,
> >> > John Hendrickx
> >> >
> >> > --- Roger Harbord <[email protected]> wrote:
> >> >> What I was really after in the end was similar to the output of e.g.
> >> >> . xi: logistic y i.x
> >> >> . reformat, eform
> >> >>
> >> >> - but with the coefficients labelled using the value labels assigned
> >> >> to x. -desmat- does achieve this, but I had a couple of different
> >> >> problems when I tried -desmat-:
> >> >>
> >> >> 1) It takes over 2 minutes to run the first univariable logistic
> >> >> regression with -desmat- on my data, when -xi- is seemingly instant.
> >> >> May be connected to the fact that my dataset has 1100 variables (and
> >> >> 2400 observations). Much quicker subsequently though, even run on
> >> >> different variables.
> >> >>
> >> >> 2) I can't see how to get -desmat- to exponentiate the coefficients
> >> >> (to give odds ratios with logistic regression) when used as a
> >> >> command prefix:
> >> >>
> >> >> . desmat: logistic y i.x
> >> >>
> >> >> gives the same output as:
> >> >>
> >> >> . desmat: logit y i.x
> >> >>
> >> >> - and there's no -eform- option as there is with -outreg- and
> >> >> -reformat-.
> >> >>
> >> >> Also I think -reformat- or -outreg- give me more flexibility in
> >> >> deciding what I want in the output, so I don't need to do so much
> >> >> work on the output before I present it to my client, which is
> >> >> ultimately my aim.
> >> >>
> >> >> In conclusion I'll probably use Nick's 'canned solution' for
> >> >> transferring value labels to variable labels of dummies, in
> >> >> combination with -reformat- or -outreg-. But maybe it would be nice
> >> >> if there was an option for -xi- to tell it to inherit the labels in
> >> >> this way. Put that on the wish list for Stata 8...
> >> >>
> >> >>
> >> >> Roger.
> >> >> ----------------------------------------------------
> >> >> Roger Harbord mailto:[email protected]
> >> >> Department of Social Medicine, University of Bristol
> >> >>
> >> >>
> >> >>
> >> >> --On 03 October 2002 09:33 +0100 Nick Cox <[email protected]> wrote:
> >> >>
> >> >> > John Hendrickx
> >> >> >
> >> >> >> -desmat- will do this. Try -ssc describe desmat-
> >> >> >
> >> >> > I tried -desmat- after my posting. I couldn't
> >> >> > see that it did quite this.
> >> >> >
> >> >> > Example:
> >> >> >
> >> >> > ------------------------------------------------------------
> >> >> >        log:  C:\Stata7\desmat.log
> >> >> >   log type:  text
> >> >> >  opened on:  3 Oct 2002, 09:30:21
> >> >> >
> >> >> > . u auto
> >> >> > (1978 Automobile Data)
> >> >> >
> >> >> > . desmat : regress mpg foreign
> >> >> >
> >> >> > ------------------------------------------------------------
> >> >> > regress
> >> >> > ------------------------------------------------------------
> >> >> > < snip >
> >> >> > ------------------------------------------------------------
> >> >> > nr  Effect                     Coeff       s.e.
> >> >> > ------------------------------------------------------------
> >> >> > foreign
> >> >> > 1   Foreign                    4.946**     1.362
> >> >> > 2   _cons                     19.827**     0.743
> >> >> > ------------------------------------------------------------
> >> >> > *  p < .05
> >> >> > ** p < .01
> >> >> >
> >> >> > . d _x_1
> >> >> >
> >> >> >               storage  display     value
> >> >> > variable name   type   format      label      variable label
> >> >> > ------------------------------------------------------------
> > === message truncated ===
> >
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/