Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Roberto Ferrer <refp16@gmail.com>
To: Stata Help <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Replicability and -imputw-
Date: Mon, 26 Aug 2013 00:07:30 +0100

Richard,

Thank you for your reply. I just posted my solution.

I remember reading that adding -stable- could in some cases obscure
other problems. I think in this case it was safe, but I thought it
would require more computation time (not exactly sure about this,
though).

Now that you mention it, I also find it interesting that the seed that
was set just before the -sort- doesn't affect it. Maybe someone can
comment on that.

Thank you.

Best,
Roberto

On Mon, Aug 26, 2013 at 12:54 AM, Richard Williams <richardwilliams.ndu@gmail.com> wrote:
> I would suggest adding the -stable- option to sort. Or (possibly better)
> have the data sorted before you start calling the program. The latter would
> be a little more efficient in terms of computing time, plus there was some
> sort of thread way back when saying sorting was better if you didn't use the
> stable option (although I don't remember why).
>
> According to the help for sort, "Without the stable option, the ordering of
> observations with equal values of varlist is randomized." I just ran a quick
> check, and as far as I can tell setting the seed does not cause the same
> random order to occur across multiple calls (which strikes me as odd, but
> maybe there is a reason for it). So, I think sorting the data first or using
> the stable option will give you what you want. Please let us know one way or
> the other.
>
>
> At 02:02 PM 8/25/2013, Roberto Ferrer wrote:
>>
>> Hello,
>>
>> I've been using a user-written command -imputw- downloaded from
>>
>> http://fdz.iab.de/187/section.aspx/Publikation/k050719a04
>> Based on Gartner, Herman. "The Imputation of Wages Above the Contribution
>> Limit with the German IAB Employment Sample." FDZ, 2005.
>>
>> My problem is with replicability. I use -set seed- to control for the
>> randomness introduced by the command but I can't manage to obtain the
>> same results for the output variable -lnw_i-.
>> Can anyone please point to the source of the "uncontrolled randomness"
>> that is affecting the results by inspecting the code?
>>
>> I've double checked, using -cf-, that the data going in is the same
>> for the replication runs. The results for the regressions are the same
>> for all runs (I've checked the log files in a bash terminal (linux)
>> using the program "diff" and they are identical except for log times).
>> But the final resulting variable is not the same for any two runs.
>>
>> Below I copy the source, since it's not very long, and the code snippet
>> I'm running.
>>
>> Thank you.
>>
>> * --------------------- User-written command -------------------------------------
>> program define imputw, byable(recall)
>>
>> version 8
>> syntax varlist [if] , Cens(varlist) Grenze(varlist) [Outvar(string asis)]
>>
>> marksample touse
>> * If no name given to the output, call it by default "lnw_i".
>> if "`outvar'" == "" {
>>     local outvar "lnw_i"
>> }
>> * Estimate Tobit model
>> cnreg `varlist' if `touse', censored(`cens')
>> quietly {
>>     * Make predictions
>>     predict xb00 if `touse' , xb
>>     * Generate standardized limit for each value
>>     gen alpha00=(ln(`grenze')-xb00)/_b[_se] if `touse'
>> }
>>
>> cap gen `outvar'=.
>> replace `outvar'=`1' if `touse'
>> * Imputation
>> replace `outvar'=xb00+_b[_se] * invnorm(uniform()*(1-norm(alpha00))+norm(alpha00)) if `touse' & `cens'
>>
>> drop xb00 alpha00
>> end
>>
>> * ------------------- Code I'm using -----------------------------------
>> set seed 391829 // -imputw- uses random number generator
>> sort yearobs size_b
>> by yearobs size_b: imputw lwage frau gebjahr bild esector, cens(censored) ///
>>     grenze(uplimit)
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
>
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME: (574)289-5227
> EMAIL: Richard.A.Williams.5@ND.Edu
> WWW: http://www.nd.edu/~rwilliam
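
For anyone reading this thread later, here is a minimal sketch of the tie-order issue discussed above. It is only an illustration, not the original poster's solution: it assumes Stata's bundled auto dataset (not the data from this thread), and the variables obsid, unstable_a, unstable_b, stable_a and stable_b are hypothetical names made up for the demonstration. It sorts twice on a key with many ties, first without and then with -stable-, and compares the resulting observation order.

```stata
* Demonstration on Stata's bundled auto data (an assumption; not the thread's data).
sysuse auto, clear
gen long obsid = _n              // hypothetical id recording the starting order

* Two sorts on a key with many ties, each preceded by the same seed.
sort obsid
set seed 391829
sort foreign
gen long unstable_a = _n
sort obsid
set seed 391829
sort foreign
gen long unstable_b = _n
count if unstable_a != unstable_b   // may be nonzero: tie order is not guaranteed

* The same two sorts with -stable-: ties keep their starting order.
sort obsid
sort foreign, stable
gen long stable_a = _n
sort obsid
sort foreign, stable
gen long stable_b = _n
assert stable_a == stable_b
```

Applied to the snippet quoted above, the analogous change would be -sort yearobs size_b, stable- before the -by- call, as Richard suggests. As far as I can tell, this matters because uniform() draws are handed out in observation order, so the same seed only reproduces the same imputed values when the observations within each by-group are in the same order on every run.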