Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: RE: Matsize and Estimation of the Variance Matrix in a Regression


From   Alex MacKay <[email protected]>
To   [email protected]
Subject   Re: st: RE: Matsize and Estimation of the Variance Matrix in a Regression
Date   Wed, 4 Sep 2013 18:00:58 -0500

Having isolated the problem, I don't think it has to do with matsize,
but rather with some nondeterminism in Stata. This is troubling. Simply
re-running the same areg command, I get the three results below. Note
that the model degrees of freedom go from 48 to 0 to 49. I've also
observed values as low as 45 and as high as 51 in additional runs.
Fixing the random seed does not seem to have any impact.
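
One way to document this run-to-run variation would be to loop over the identical command and print the stored results each time. The following is only a sketch, assuming the specification quoted further down in this thread; the seed value and the number of repetitions are arbitrary choices, and e(df_m) and e(rmse) are the model degrees of freedom and root MSE that areg stores after estimation.

```stata
* Sketch only: re-run the identical -areg- and print the model df and root MSE.
* The seed (12345) and the 5 repetitions are arbitrary, hypothetical choices.
* Run from a do-file, since /// line continuation is not valid interactively.
set seed 12345
forvalues i = 1/5 {
    quietly areg ln_price treatment postperiod treatmentXpostperiod ///
        ln_unemployment ln_population ln_income price_index ///
        i.week i.retailer_id i.state, absorb(product) vce(cluster clusterID)
    display "run `i': df_m = " e(df_m) "   rmse = " e(rmse)
}
```

If the printed values differ across iterations with the data and seed held fixed, that would support the view that the variation is not driven by matsize.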

Should I repost this under a new thread name?

The levels are: 141 (week), 73 (retailer_id), 24 (state_id), 25
(product), and 46 (clusterID), for a total of 309. Given that I am
only estimating 7 other coefficients, I think we can reject my earlier
hypothesis that the number of fixed effects exceeds the original
matsize of 800.
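
A minimal sketch of how these counts could be checked, assuming the variable names used in this thread (the posts refer to the state and product variables under more than one name, so adjust as needed):

```stata
* Sketch only: count the distinct levels of each categorical variable.
* Variable names are assumptions taken from the posts in this thread.
foreach v of varlist week retailer_id state_id product clusterID {
    quietly tabulate `v'
    display "`v': " r(r) " levels"
}

* Total of the level counts reported above.
display 141 + 73 + 24 + 25 + 46
```

The last line displays 309, which is well below the matsize of 800 mentioned in the original post.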

Alex

- - -

1.
note: 2599.week omitted because of collinearity
note: 597.retailer_id omitted because of collinearity
note: 866.retailer_id omitted because of collinearity
note: 877.retailer_id omitted because of collinearity
note: 9101.retailer_id omitted because of collinearity
note: 54.state_id omitted because of collinearity
note: 3997.retailer_id omitted because of collinearity
note: 4955.retailer_id omitted because of collinearity
note: 7005.retailer_id omitted because of collinearity
note: 7599.retailer_id omitted because of collinearity


Linear regression, absorbing indicators           Number of obs   =        597
                                                  F(  48,     45) =          .
                                                  Prob > F        =          .
                                                  R-squared       =     0.9256
                                                  Adj R-squared   =     0.8695
                                                  Root MSE        =     0.3082

                         (Std. Err. adjusted for 46 clusters in clusterID)
---------------------------------------------------------------------------------------
                      |               Robust
             ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
            treatment |  -4.044072   3.152507    -1.28   0.206    -10.39355    2.305404
           postperiod |  -.5653387   .3338128    -1.69   0.097    -1.237672    .1069948
 treatmentXpostperiod |  -.0178175   .1210774    -0.15   0.884    -.2616798    .2260448


2.

note: 2599.week omitted because of collinearity
note: 597.retailer_id omitted because of collinearity
note: 866.retailer_id omitted because of collinearity
note: 877.retailer_id omitted because of collinearity
note: 9101.retailer_id omitted because of collinearity
note: 54.state_id omitted because of collinearity
Warning:  variance matrix is nonsymmetric or highly singular
note: 3997.retailer_id omitted because of collinearity
note: 4955.retailer_id omitted because of collinearity
note: 7005.retailer_id omitted because of collinearity
note: 7599.retailer_id omitted because of collinearity

 Linear regression, absorbing indicators           Number of obs   =        597
                                                  F(   0,     45) =          .
                                                  Prob > F        =          .
                                                  R-squared       =     0.9256
                                                  Adj R-squared   =     0.8695
                                                  Root MSE        =     0.2950

                         (Std. Err. adjusted for 46 clusters in clusterID)
---------------------------------------------------------------------------------------
                      |               Robust
             ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
            treatment |  -4.044072          .        .       .            .           .
           postperiod |  -.5653387          .        .       .            .           .
 treatmentXpostperiod |  -.0178175          .        .       .            .           .


3.

note: 2599.week omitted because of collinearity
note: 597.retailer_id omitted because of collinearity
note: 866.retailer_id omitted because of collinearity
note: 877.retailer_id omitted because of collinearity
note: 9101.retailer_id omitted because of collinearity
note: 54.state_id omitted because of collinearity
note: 3997.retailer_id omitted because of collinearity
note: 4955.retailer_id omitted because of collinearity
note: 7005.retailer_id omitted because of collinearity
note: 7599.retailer_id omitted because of collinearity

Linear regression, absorbing indicators           Number of obs   =        597
                                                  F(  49,     45) =          .
                                                  Prob > F        =          .
                                                  R-squared       =     0.9256
                                                  Adj R-squared   =     0.8695
                                                  Root MSE        =     0.3085

                         (Std. Err. adjusted for 46 clusters in clusterID)
---------------------------------------------------------------------------------------
                      |               Robust
             ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
            treatment |  -4.044072   3.152507    -1.28   0.206    -10.39355    2.305404
           postperiod |  -.5653387   .3338128    -1.69   0.097    -1.237672    .1069948
 treatmentXpostperiod |  -.0178175   .1210774    -0.15   0.884    -.2616798    .2260448

On Wed, Sep 4, 2013 at 11:07 AM, Joe Canner <[email protected]> wrote:
> How many levels are in week, retailer_id, state, product, and clusterID?
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Alex MacKay
> Sent: Wednesday, September 04, 2013 12:03 PM
> To: [email protected]
> Subject: Re: st: RE: Matsize and Estimation of the Variance Matrix in a Regression
>
> 1. The full specification is:
>
> areg ln_price treatment postperiod treatmentXpostperiod ln_unemployment ln_population ln_income price_index ///
>      i.week i.retailer_id i.state, absorb(product) vce(cluster clusterID)
>
> 2. The fixed effects variables are stored as integers.
>
> 3. I'm increasing the matsize because I am running several regressions, and for some I run into the issue: "matsize too small." I re-ran all regressions, and for a few (like the one above) that did not have the error, the results changed.
>
> Alex
>
> On Wed, Sep 4, 2013 at 10:24 AM, Joe Canner <[email protected]> wrote:
>> Alex,
>>
>> I'm no -areg- expert, but I would suggest that if you want to get more traction with this question, you should probably provide additional information, including:
>>
>> 1. The complete specification of your model
>> 2. A description of the variables in your model (e.g., if categorical, how many levels)
>> 3. Why you are increasing the -matsize- in the first place
>>
>> I suspect that the model has some intrinsic problems that need to be fixed (perhaps something similar to what you have suggested) which will probably take care of the -matsize- issue (which is probably more of a symptom than a cause), but we would need to know more before offering a solution.
>>
>> Regards,
>> Joe Canner
>> Johns Hopkins University School of Medicine
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Alex MacKay
>> Sent: Wednesday, September 04, 2013 9:58 AM
>> To: [email protected]
>> Subject: st: Matsize and Estimation of the Variance Matrix in a
>> Regression
>>
>> Dear statalist,
>>
>> I have run into an issue: when I increase matsize, a regression that
>> previously ran with no warnings can return:
>> "Warning: variance matrix is nonsymmetric or highly singular."
>>
>> It estimates the exact same coefficients across the board. I've put
>> the log for the first coefficient below. Notice the Warning in advance
>> of the output. With the larger matsize (10000), it does not estimate
>> standard errors, and the model degrees of freedom are zero.
>>
>> I am using the areg command to absorb the variable product_id. Is it
>> possible that Stata is trying to generate a number of fixed effects
>> that exceeds 800, the original matsize, and decides to drop the
>> product_id dummy variables? This may allow it to estimate standard
>> errors. If so, I think it should be reported as a bug.
>>
>> Alex
>>
>> (Note: I'm reposting in a way that may more clearly identify the
>> issues, now that I am familiar with replying).
>>
>>
>> //Matsize = 10000
>>
>>
>> note: 2599.week omitted because of collinearity
>> note: 597.retailer_id omitted because of collinearity
>> note: 866.retailer_id omitted because of collinearity
>> note: 877.retailer_id omitted because of collinearity
>> note: 9101.retailer_id omitted because of collinearity
>> note: 54.state_id omitted because of collinearity
>> Warning:  variance matrix is nonsymmetric or highly singular
>> note: 3997.retailer_id omitted because of collinearity
>> note: 4955.retailer_id omitted because of collinearity
>> note: 7005.retailer_id omitted because of collinearity
>> note: 7599.retailer_id omitted because of collinearity
>>
>> Linear regression, absorbing indicators           Number of obs   =        597
>>
>>                                                   F(   0,     45) =          .
>>                                                   Prob > F        =          .
>>                                                   R-squared       =     0.9256
>>                                                   Adj R-squared   =     0.8695
>>                                                   Root MSE        =     0.2950
>>
>>                       (Std. Err. adjusted for 46 clusters in clusterID)
>> ------------------------------------------------------------------------------
>>              |               Robust
>>     ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>    treatment |  -4.044072          .        .       .            .           .
>>
>>
>>
>> //Matsize == 800
>>
>> note: 2599.week omitted because of collinearity
>> note: 597.retailer_id omitted because of collinearity
>> note: 866.retailer_id omitted because of collinearity
>> note: 877.retailer_id omitted because of collinearity
>> note: 9101.retailer_id omitted because of collinearity
>> note: 54.fips omitted because of collinearity
>> note: 3997.retailer_id omitted because of collinearity
>> note: 4955.retailer_id omitted because of collinearity
>> note: 7005.retailer_id omitted because of collinearity
>> note: 7599.retailer_id omitted because of collinearity
>>
>> Linear regression, absorbing indicators           Number of obs   =        597
>>
>>                                                   F(  49,     45) =          .
>>                                                   Prob > F        =          .
>>                                                   R-squared       =     0.9256
>>                                                   Adj R-squared   =     0.8695
>>                                                   Root MSE        =     0.3085
>>
>>                       (Std. Err. adjusted for 46 clusters in clusterID)
>> ------------------------------------------------------------------------------
>>              |               Robust
>>     ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>    treatment |  -4.044072   3.152507    -1.28   0.206    -10.39355    2.305404
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

