Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Replicability and -imputw-

From	Richard Williams <[email protected]>
To	[email protected], Stata Help <[email protected]>
Subject	Re: st: Replicability and -imputw-
Date	Sun, 25 Aug 2013 18:54:05 -0500

I would suggest adding the -stable- option to sort. Or (possibly better) have the data sorted before you start calling the program. The latter would be a little more efficient in terms of computing time, plus there was some sort of thread way back when saying sorting was better if you didn't use the stable option (although I don't remember why).
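
To make the two options concrete, here is a minimal sketch built on the calls from the original post. Note that obs_id is a hypothetical unique identifier that I am assuming exists; substitute whatever uniquely identifies an observation in your data. The two options are alternatives, not steps to run in sequence.

```stata
* Option 1: add -stable- so tied observations keep their current relative order
set seed 391829
sort yearobs size_b, stable
by yearobs size_b: imputw lwage frau gebjahr bild esector, cens(censored) ///
    grenze(uplimit)

* Option 2: sort once, up front, on a key that has no ties
* (obs_id is a hypothetical identifier; -isid- stops with an error if it is not unique)
isid obs_id
sort yearobs size_b obs_id
set seed 391829
by yearobs size_b: imputw lwage frau gebjahr bild esector, cens(censored) ///
    grenze(uplimit)
```

Option 1 only reproduces if the data enter the sort in the same order on every run. Option 2 does not depend on the incoming order, because the full sort key leaves no ties to break.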

According to the help for sort, "Without the stable option, the ordering of observations with equal values of varlist is randomized." I just ran a quick check, and as far as I can tell setting the seed does not cause the same random order to occur across multiple calls (which strikes me as odd, but maybe there is a reason for it). So, I think sorting the data first or using the stable option will give you what you want. Please let us know one way or the other.
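
A check along these lines could look like the sketch below. It uses toy data and made-up variable names (id, g, order1 to order4), and it is not the exact check I ran. It does not assume a particular result for the first comparison; the count simply reports how many observations land in a different position on the second sort.

```stata
clear
set obs 1000
gen long id = _n            // records the original order; unique
gen byte g  = mod(_n, 5)    // only 5 distinct values, so many ties

* Same seed before each sort, no -stable-
set seed 12345
sort g
gen long order1 = _n
sort id                     // restore the original order
set seed 12345
sort g
gen long order2 = _n
count if order1 != order2   // nonzero means ties were ordered differently

* Same exercise with -stable-: ties keep the id order both times
sort id
sort g, stable
gen long order3 = _n
sort id
sort g, stable
gen long order4 = _n
assert order3 == order4
```

The -assert- at the end should pass regardless of the seed, since both stable sorts start from the same id order.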


At 02:02 PM 8/25/2013, Roberto Ferrer wrote:

Hello,

I've been using a user-written command -imputw- downloaded from

http://fdz.iab.de/187/section.aspx/Publikation/k050719a04
Based on Gartner, Herman. "The Imputation of Wages Above the Contribution
Limit with the German IAB Employment Sample." FDZ, 2005.

My problem is with replicability. I use -set seed- to control for the
randomness introduced by the command but I can't manage to obtain the
same results for the output variable -lnw_i-. Can anyone please point
to source of "uncontrolled randomness" that is affecting the results
by inspecting the code?

I've double checked, using -cf-, that the data going in is the same
for the replication runs. The results for the regressions are the same
for all runs (I've checked the log files in a bash terminal (linux)
using the program "diff" and they are identical except for log times).
But the final resulting variable is not the same for any two runs.

I copy the source below since it's not very long and the code snippet
I'm running.

Thank you.

* --------------------- User-written command -------------------------------------
program define imputw, byable(recall)

    version 8
    syntax varlist [if] , Cens(varlist) Grenze(varlist) [Outvar(string asis)]

    marksample touse
    * If no name given to the output, call it by default "lnw_i".
    if "`outvar'" == "" {
        local outvar "lnw_i"
    }
    * Estimate Tobit model
    cnreg `varlist' if `touse', censored(`cens')
    quietly {
        * Make predictions
        predict xb00 if `touse', xb
        * Generate standardized limit for each value
        gen alpha00 = (ln(`grenze')-xb00)/_b[_se] if `touse'
    }

    cap gen `outvar' = .
    replace `outvar' = `1' if `touse'
    * Imputation
    replace `outvar' = xb00 + _b[_se] * ///
        invnorm(uniform()*(1-norm(alpha00))+norm(alpha00)) if `touse' & `cens'

    drop xb00 alpha00
end

* ------------------- Code I'm using -----------------------------------
set seed 391829 // -imputw- uses random number generator
sort yearobs size_b
by yearobs size_b: imputw lwage frau gebjahr bild esector, cens(censored) ///
grenze(uplimit)
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam
