Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Replicability and -imputw-
From
Roberto Ferrer <[email protected]>
To
Stata Help <[email protected]>
Subject
Re: st: Replicability and -imputw-
Date
Mon, 26 Aug 2013 00:07:30 +0100
Richard,
Thank you for your reply. I just posted my solution. I remember
reading that adding -stable- could in some cases obscure other
problems. I think in this case it was safe, but I thought it would
require more computation time (not exactly sure about this, though).
Now that you mention it, I also find it interesting that the seed that
was set just before the -sort- doesn't affect it. Maybe someone can
comment on that.
Thank you.
Bests,
Roberto
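
The two remedies discussed in this thread can be sketched roughly as follows. This is an illustration only, not the solution as actually posted: obs_id is a hypothetical variable name that does not appear in the original code, and option (b) assumes the data are loaded in the same observation order on every run (the -cf- check in the original question suggests they are). In both cases the aim is the same: uniform() is evaluated observation by observation in the current order, so if ties in -sort- are ordered differently, the same draws land on different observations.

```stata
* (a) Keep tied observations in their incoming order
sort yearobs size_b, stable

* (b) Alternative: add a unique key so that no ties remain
*     (obs_id is a hypothetical name, not part of the original code)
* gen long obs_id = _n
* sort yearobs size_b obs_id

* Set the seed after sorting, immediately before the random draws
set seed 391829
by yearobs size_b: imputw lwage frau gebjahr bild esector, ///
    cens(censored) grenze(uplimit)
```

Option (b) still satisfies the -by- prefix, because the data remain sorted by yearobs and size_b as the leading sort keys.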
On Mon, Aug 26, 2013 at 12:54 AM, Richard Williams
<[email protected]> wrote:
> I would suggest adding the -stable- option to sort. Or (possibly better)
> have the data sorted before you start calling the program. The latter would
> be a little more efficient in terms of computing time, plus there was some
> sort of thread way back when saying sorting was better if you didn't use the
> stable option (although I don't remember why).
>
> According to the help for sort, "Without the stable option, the ordering of
> observations with equal values of varlist is randomized." I just ran a quick
> check, and as far as I can tell setting the seed does not cause the same
> random order to occur across multiple calls (which strikes me as odd, but
> maybe there is a reason for it). So, I think sorting the data first or using
> the stable option will give you what you want. Please let us know one way or
> the other.
>
>
> At 02:02 PM 8/25/2013, Roberto Ferrer wrote:
>>
>> Hello,
>>
>> I've been using a user-written command -imputw- downloaded from
>>
>> http://fdz.iab.de/187/section.aspx/Publikation/k050719a04
>> Based on Gartner, Herman. "The Imputation of Wages Above the Contribution
>> Limit with the German IAB Employment Sample." FDZ, 2005.
>>
>> My problem is with replicability. I use -set seed- to control for the
>> randomness introduced by the command but I can't manage to obtain the
>> same results for the output variable -lnw_i-. Can anyone please point
>> to the source of "uncontrolled randomness" that is affecting the results
>> by inspecting the code?
>>
>> I've double checked, using -cf-, that the data going in is the same
>> for the replication runs. The results for the regressions are the same
>> for all runs (I've checked the log files in a bash terminal (linux)
>> using the program "diff" and they are identical except for log times).
>> But the final resulting variable is not the same for any two runs.
>>
>> I copy the source below since it's not very long and the code snippet
>> I'm running.
>>
>> Thank you.
>>
>> * --------------------- User-written command -------------------------------------
>> program define imputw, byable(recall)
>>
>> version 8
>> syntax varlist [if] , Cens(varlist) Grenze(varlist) [Outvar(string asis)]
>>
>> marksample touse
>> * If no name given to the output, call it by default "lnw_i".
>> if "`outvar'" == "" {
>> local outvar "lnw_i"
>> }
>> * Estimate Tobit model
>> cnreg `varlist' if `touse', censored(`cens')
>> quietly {
>> * Make predictions
>> predict xb00 if `touse' , xb
>> * Generate standardized limit for each value
>> gen alpha00=(ln(`grenze')-xb00)/_b[_se] if `touse'
>> }
>>
>> cap gen `outvar'=.
>> replace `outvar'=`1' if `touse'
>> * Imputation
>> replace `outvar'=xb00+_b[_se] * ///
>> invnorm(uniform()*(1-norm(alpha00))+norm(alpha00)) if `touse' & `cens'
>>
>> drop xb00 alpha00
>> end
>>
>> * ------------------- Code I'm using -----------------------------------
>> set seed 391829 // -imputw- uses random number generator
>> sort yearobs size_b
>> by yearobs size_b: imputw lwage frau gebjahr bild esector, cens(censored) ///
>> grenze(uplimit)
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
>
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME: (574)289-5227
> EMAIL: [email protected]
> WWW: http://www.nd.edu/~rwilliam
>
>
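
One way to check whether the tie order in -sort- is what breaks replicability is to run the imputation twice in the same session under the same seed and compare the two results. The following is a rough sketch under stated assumptions: the -imputw- program from the original question is already defined, and obs_id, lnw_run1 and lnw_run2 are hypothetical names introduced here for illustration only.

```stata
* Hypothetical unique key recording the incoming observation order
gen long obs_id = _n

forvalues r = 1/2 {
    sort obs_id                    // restore the incoming order before each run
    sort yearobs size_b, stable    // ties keep that order
    set seed 391829
    by yearobs size_b: imputw lwage frau gebjahr bild esector, ///
        cens(censored) grenze(uplimit) outvar(lnw_run`r')
}

* Stops with an error if any observation differs between the two runs
assert lnw_run1 == lnw_run2
```

If the -assert- passes with -stable- but fails once the option is removed, the randomized tie order is the likely source of the differences. The explicit -sort obs_id- matters: without it the second -sort- would find the data already sorted and leave the order untouched, so the comparison would pass either way.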