[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
st: RE: -nl- command bug?

From	"FEIVESON, ALAN H. (AL) (JSC-SK) (NASA)" <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	st: RE: -nl- command bug?
Date	Wed, 18 Jun 2003 13:37:45 -0500
Paulo - The difference is that for nlpark1, the R2 is calculated as if there
were no constant ,so the model sum of squares is raw, not centered. This
inflates the "R^2". In the second case, where there are linear terms, -nl-
apparently calculates as if there is a constant term. This is analogous to
fitting linear regression models one with a constant, the other without. For
example the following 2 regressions give almost identical fits (root MSE =
1.84), yet the "R2"'s are different because in the second case, the constant
term is assumed known ( = 6).

. clear

. range x 0 1 101
obs was 0, now 101

. gen y=6+10*x+2*invnorm(uniform())

. reg y x

      Source |       SS       df       MS              Number of obs =
101
-------------+------------------------------           F(  1,    99) =
226.42
       Model |  770.936359     1  770.936359           Prob > F      =
0.0000
    Residual |  337.078803    99   3.4048364           R-squared     =
0.6958
-------------+------------------------------           Adj R-squared =
0.6927
       Total |  1108.01516   100  11.0801516           Root MSE      =
1.8452

----------------------------------------------------------------------------
--
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf.
Interval]
-------------+--------------------------------------------------------------
--
           x |   9.476307   .6297642    15.05   0.000     8.226718
10.7259
       _cons |   6.251979   .3645024    17.15   0.000     5.528728
6.975231
----------------------------------------------------------------------------
--

. gen ym6=y-6

. reg ym6 x,noc

      Source |       SS       df       MS              Number of obs =
101
-------------+------------------------------           F(  1,   100) =
969.68
       Model |  3284.35327     1  3284.35327           Prob > F      =
0.0000
    Residual |  338.705947   100  3.38705947           R-squared     =
0.9065
-------------+------------------------------           Adj R-squared =
0.9056
       Total |  3623.05921   101  35.8718734           Root MSE      =
1.8404

----------------------------------------------------------------------------
--
         ym6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf.
Interval]
-------------+--------------------------------------------------------------
--
           x |   9.852396   .3163941    31.14   0.000     9.224679
10.48011
----------------------------------------------------------------------------
--

In the case of -nl-, it's not obvious to me how -nl- decides what to use for
the "constant" term in it's output. I've wondered about this myself.

Al Feiveson



-----Original Message-----
From: Paulo Guimaraes [mailto:[email protected]]
Sent: Wednesday, June 18, 2003 10:29 AM
To: [email protected]
Subject: st: -nl- command bug?


Dear All,

While using the -nl- command I have come across a
rather puzzling result. Two different specifications
that fit the data equally well (the correlation from
their predicted values is 1) show an r-squared of
0.1 and 0.7 respectively.
It seems that in one of the models stata is
incorrectly calculating the sum of squares from the
model (I could replicate the value for one of the
models but not for the other).
Any hint to what is happening? It seems to me like a
bug but I may be overlooking something. I also dont
understand why the degrees of freedom is 3 in one
model and 4 in the other.
Here is the code and results:
**************************************************
. program define nlpark1
  1.         version 7
  2.         if  "`1'"=="?" {
  3.         global S_1 "b2 b3 b4 b5"
  4.         global b2=0
  5.         global b3=.01
  6.         global b4=0
  7.         global b5=10
  8.         exit
  9.         }
 10.         replace `1'
=exp($b2)*exp($b3*time)+exp($b4)*exp(-$b5*time)
 11. end

.
. program define nlpark2
  1.         version 7
  2.         if  "`1'"=="?" {
  3.         global S_1 "b2 b3 b4 b5"
  4.         global b2=0
  5.         global b3=.01
  6.         global b4=0
  7.         global b5=10
  8.         exit
  9.         }
 10.         replace `1'
=$b2+$b3*time+exp($b4)*exp(-$b5*time)
 11. end

.
.
. use final

. drop if  total==.|total==0
(10 observations deleted)

. gen time=days0/365.25

. drop if time==.
(0 observations deleted)

. sum total if days0==0

    Variable |     Obs        Mean   Std. Dev.
Min        Max
-------------+-----------------------------------------------------
       total |     150    31.09333   12.82777
6.5         73

. local avdep=r(mean)

. local startv=log(`avdep'/2)

. nl park1 total, eps(1e-9) init(b2=
`startv',b4=`startv')
(obs = 1576)

Iteration 0:  residual SS =  204943.1
...
Iteration 6:  residual SS =  196084.1

      Source |       SS       df       MS
Number of obs =      1576
-------------+------------------------------
F(  4,  1572) =   1342.48
       Model |  669821.354     4  167455.339
Prob > F      =    0.0000
    Residual |  196084.146  1572  124.735462
R-squared     =    0.7736
-------------+------------------------------
Adj R-squared =    0.7730
       Total |    865905.5  1576  549.432424
Root MSE      =   11.1685

Res. dev.     =  12074.57
(park1)
----------------------------------------------------------------------------
--
       total |      Coef.   Std. Err.      t    P>|t|
   [95% Conf. Interval]
-------------+--------------------------------------------------------------
--
          b2 |   2.805342   .0511239    54.87   0.000
   2.705063     2.90562
          b3 |   .0977053   .0379304     2.58   0.010
   .0233058    .1721048
          b4 |   2.673872   .0815743    32.78   0.000
   2.513866    2.833877
          b5 |   10.69553   2.007598     5.33   0.000
   6.757673    14.63338
----------------------------------------------------------------------------
--
 (SE's, P values, CI's, and correlations are
asymptotic approximations)

. predict tm1
(option yhat assumed; fitted values)

. nl park2 total, eps(1e-9) init(b2=
`startv',b4=`startv')
(obs = 1576)

Iteration 0:  residual SS =  547186.4
...
Iteration 5:  residual SS =  196080.5

      Source |       SS       df       MS
Number of obs =      1576
-------------+------------------------------
F(  3,  1572) =     60.60
       Model |  22675.1691     3  7558.38969
Prob > F      =    0.0000
    Residual |  196080.524  1572  124.733158
R-squared     =    0.1037
-------------+------------------------------
Adj R-squared =    0.1019
       Total |  218755.693  1575  138.892503
Root MSE      =   11.1684

Res. dev.     =  12074.54
(park2)
----------------------------------------------------------------------------
--
       total |      Coef.   Std. Err.      t    P>|t|
   [95% Conf. Interval]
-------------+--------------------------------------------------------------
--
          b2 |   16.42274   .9216482    17.82   0.000
   14.61495    18.23053
          b3 |   1.824629   .7069261     2.58   0.010
    .438012    3.211246
          b4 |   2.681091   .0842907    31.81   0.000
   2.515757    2.846425
          b5 |    10.5638   2.023805     5.22   0.000
   6.594161    14.53344
----------------------------------------------------------------------------
--
* Parameter b2 taken as constant term in model & ANOVA
table
 (SE's, P values, CI's, and correlations are
asymptotic approximations)

. predict tm2
(option yhat assumed; fitted values)

. corr tm1 tm2
(obs=1576)

             |      tm1      tm2
-------------+------------------
         tm1 |   1.0000
         tm2 |   1.0000   1.0000

**************************************************
thanks

Paulo Guimaraes

__________________________________
Do you Yahoo!?
SBC Yahoo! DSL - Now only $29.95 per month!
http://sbc.yahoo.com
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
Prev by Date: st: Pseudo R2 for XTGEE estimates
Next by Date: st: RE: [Cox model]
Previous by thread: st: Pseudo R2 for XTGEE estimates
Next by thread: st: RE: [Cox model]
Index(es):
- Date
- Thread