Marc Philipp <[email protected]> is trying to reproduce the standard error
calculations produced by the -jackknife:- prefix command:
> I am still trying to understand how the -jackknife:- command computes the
> standard errors of the parameters. I made some progress, but I still have a
> problem that is puzzling me. Actually, I tried to replicate these standard
> errors using the method outlined in Miller (1974), which is based on Tukey
> (1958). According to Stata user guide, this is the method implemented in
> Stata.
>
> However, I don't manage to get the same standard errors. I send you my
> output below, where you can see how I tried to replicate the results. You
> can see that the Jackknifed parameters are exactly the same, but the
> standard errors produced by the -jackknife:- command are smaller than those
> I computed. They should be the same. Am I making a mistake or is Stata
> using another method to compute these standard errors?
> . jackknife _b[x] e(delta), cluster(tt) saving(jack, replace): nbreg y x d*, disp(c) nocons
> (running nbreg on estimation sample)
>
> Jackknife replications (3)
> ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
> ...
>
> Jackknife results                               Number of obs      =       300
>                                                 Number of clusters =         3
>                                                 Replications       =         3
>
>       command:  nbreg y x d*, disp(c) nocons
>         _jk_1:  _b[x]
>         _jk_2:  e(delta)
>           n():  e(N)
>
> ------------------------------------------------------------------------------
>              |              Jackknife
>              |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        _jk_1 |   1.013864   .1062226     9.54   0.011     .5568252    1.470903
>        _jk_2 |   1.362775   .0625554    21.79   0.002     1.093621    1.631929
> ------------------------------------------------------------------------------
>
> . matrix bet = e(b)
> . matrix list e(b_jk)
>
> e(b_jk)[1,2]
>         _jk_1      _jk_2
> y1  1.0350594  2.6554469
>
> . use jack.dta, clear
> (jackknife: nbreg)
>
> . gen beta_i = 3*bet[1,1]-2*_jk_1
> . gen delta_i = 3*bet[1,2]-2*_jk_2
> . su beta_i delta_i
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>       beta_i |         3    1.035059    .1839829   .8623033   1.228518
>      delta_i |         3    2.655447     .108349   2.582983   2.780004
In the code above, Marc uses -jackknife- to save a dataset containing the
jackknife replicates of his statistics of interest; there are only 3
replicates because of the clustering. He then uses these replicates to
generate his own pseudovalues, and finally runs -summarize- on the newly
generated variables.

-summarize- reports the standard deviation of Marc's new variables, whereas
the standard error reported by -jackknife- is the standard error of the mean
of the pseudovalues. The two differ by a factor of sqrt(1/n), where n is the
number of replicates (n = 3 in Marc's example):
.1062226 = .1839829 * sqrt(1/3)
.0625554 = .108349 * sqrt(1/3)
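
Written out, the relationship looks as follows. This is the usual
pseudovalue form of the jackknife, which agrees with the numbers above; the
symbols are my own shorthand rather than Stata's notation. Here theta-hat is
the statistic from the full sample, theta-hat_(i) is the statistic with
cluster i deleted, and s is the standard deviation that -summarize- reports
for the pseudovalues.

```latex
\theta^{*}_{i} = n\,\hat\theta - (n-1)\,\hat\theta_{(i)}, \qquad
\bar\theta^{*} = \frac{1}{n}\sum_{i=1}^{n}\theta^{*}_{i}, \qquad
\widehat{\mathrm{se}}
  = \sqrt{\frac{1}{n(n-1)}\sum_{i=1}^{n}\bigl(\theta^{*}_{i}-\bar\theta^{*}\bigr)^{2}}
  = \frac{s}{\sqrt{n}}
```

With n = 3 the first expression is exactly Marc's 3*bet - 2*_jk, and the
last one is the sqrt(1/3) factor shown above.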
Marc could use the -ci- command instead of -summarize- to reproduce the
standard error calculations of -jackknife-.
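
A minimal sketch of that check, continuing Marc's session. It assumes
jack.dta and the matrix bet from his log are still available; the loop is
only one way of doing the sqrt(1/n) scaling by hand.

```stata
* Sketch only: assumes jack.dta and the matrix bet from Marc's session.
use jack.dta, clear
generate beta_i  = 3*bet[1,1] - 2*_jk_1
generate delta_i = 3*bet[1,2] - 2*_jk_2

* Standard error of the mean by hand: sd/sqrt(n)
foreach v of varlist beta_i delta_i {
    quietly summarize `v'
    display "`v'" _col(12) "se = " %9.7f r(sd)/sqrt(r(N))
}

* The same Std. Err. column via -ci-.
* From Stata 14 on the syntax is: ci means beta_i delta_i
ci beta_i delta_i
```

Note that -ci- centers its interval on the mean of the pseudovalues, so only
the Std. Err. column should be compared with the -jackknife- table.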
--Jeff
[email protected]