Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Quantile Regression
From: "JVerkuilen (Gmail)" <[email protected]>
To: [email protected]
Subject: Re: st: Quantile Regression
Date: Tue, 2 Oct 2012 22:12:59 -0400
On Tue, Oct 2, 2012 at 7:31 PM, Steve Samuels <[email protected]> wrote:
>
> Without details (see FAQ 3.3 first sentence), we can only guess. This
> could happen if 1) you did not set the same random seed before each
> -sqreg- and -bsqreg- command; 2) the number of bootstrap replicates
> differed between -sqreg- and -bsqreg- runs; or 3) -sqreg- does not
> reject replicates in which convergence failed for any quantile.
If the standard errors are different, it's no great surprise when you're
running a bootstrap. Everything Steve lists makes sense. Check on a known
dataset (such as auto) and fix the seed.
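A minimal sketch of that check, assuming the auto data shipped with
Stata. The seed value 12345 and reps(500) are arbitrary choices for
illustration and do not come from the original question:

. sysuse auto, clear
. * same seed and same number of replications before each command
. set seed 12345
. bsqreg price mpg, quantile(.5) reps(500)
. set seed 12345
. sqreg price mpg, quantiles(.5) reps(500)

With the seed and the number of replications held fixed, rerunning
either command should reproduce its own standard errors. Whether the two
commands then agree with each other is the thing being checked.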
> By the way, the manual states that -sqreg- is faster than -bsqreg-.
I believe that computationally there are some speedups because the
linear program can be solved for one quantile and simply updated to
get the rest of the quantiles, but I could be mistaken. Roger
Koenker's book (Quantile Regression, Cambridge University Press, 2005)
discusses computation in detail. There are also analytic alternatives
to bootstrapping that might be much faster. -qreg- generates standard
errors analytically using a weighting matrix and a density estimator
of the residuals.
. sysuse auto
. qreg price mpg
Median regression                                   Number of obs =         74
  Raw sum of deviations   142205 (about 4934)
  Min sum of deviations 129521.7                    Pseudo R2     =     0.0892
------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -135.6667   67.26576    -2.02   0.047    -269.7585   -1.574816
       _cons |   8088.667   1483.808     5.45   0.000     5130.749    11046.58
------------------------------------------------------------------------------
. * note that -bsqreg- defaults to only 20 replications!
. bsqreg price mpg, reps(999)
Median regression, bootstrap(999) SEs               Number of obs =         74
  Raw sum of deviations   142205 (about 4934)
  Min sum of deviations 129521.7                    Pseudo R2     =     0.0892
------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -135.6667   35.63527    -3.81   0.000    -206.7043   -64.62906
       _cons |   8088.667   889.0486     9.10   0.000     6316.381    9860.953
------------------------------------------------------------------------------
In this case the standard errors are markedly different, and playing
with the different methods in -qreg- gives quite different values, but
I don't know enough to comment on why. I am inclined to trust the
bootstrapped ones because this problem has a rather small N.
I suspect that it is very slow on a huge problem though, given that it
needs to sort the residuals. Koenker did a good deal of work on
alternatives such as inverting a rank test; I think the R
implementation of quantile regression (the quantreg package) has this.
Again, see his book.
> I've never had the luxury of having so many observations to analyze. I
> imagine that almost every simple model can be rejected, so that model
> building and validation are real challenges.
Randomly subsample and do a real cross validation?
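A rough sketch of what I mean, using the auto data only as a stand-in
for a large dataset. The 50/50 split, the seed, and the variable names
train, yhat, and absdev are made up for illustration:

. sysuse auto, clear
. set seed 12345
. * random half of the observations for fitting, the rest held out
. generate byte train = runiform() < 0.5
. qreg price mpg if train
. * predicted median on the held-out half only
. predict double yhat if !train
. generate double absdev = abs(price - yhat) if !train
. summarize absdev

The mean of absdev is the out-of-sample mean absolute deviation, which
is the natural loss for a median regression. Comparing it across
candidate models, or repeating the split a few times, gives a rough
validation check that does not lean on in-sample significance tests.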
Jay
--
JVVerkuilen, PhD
[email protected]
"Out beyond ideas of wrong-doing and right-doing there is a field.
I'll meet you there. When the soul lies down in that grass the world
is too full to talk about." ---Rumi