Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Controlling precision for multiple runs of same code (out of office until 12th June)
From
"Seyi Soremekun" <[email protected]>
To
"statalist" <[email protected]>
Subject
Re: st: Controlling precision for multiple runs of same code (out of office until 12th June)
Date
Tue, 04 Jun 2013 00:56:50 +0100
I am currently out of the office until the 12th June with limited email contact.
Please contact Angela Vega ([email protected]) for any enquiries.
>>> Melanie Leis <[email protected]> 06/04/13 00:55 >>>
Hello,
I'm having trouble with a section of my code that yields different
results each time I run it.
I start out with a dataset, baseline_4.dta, which has 47,267,047
observations and 16 variables, and run this:
merge m:m statefips agecat_census using "ABCD.dta"
assert _merge==3
drop _merge
egen tot_pop=sum(pop), by(statefips countyfips agecat_census sexcat ///
    racecat iprcat_mpact iprcat coverage groupsize)
checkpop
rename pop oldpop
gen pop=tot_pop*prob_agecat_mpact
checkpop
collapse (sum) pop, by(statefips countyfips agecat_mpact sexcat ///
    racecat iprcat_mpact iprcat coverage groupsize)
checkpop
sum
sort _all
save "baseline_5.dta", replace
checkpop is a program that tells me what my total population is each
time I run it. My total population is the same before and after the
collapse command (see results below).
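For readers without the program, here is a minimal sketch of what a checkpop-style program might look like. It is inferred only from the three lines checkpop prints in the logs below; the actual definition is not shown in this post, so the body, the display formats, and the assumption that it reads the variable pop are all hypothetical.

```stata
* Hypothetical reconstruction of checkpop; the real program is not shown.
capture program drop checkpop
program define checkpop
    * Sum pop over all observations without printing the summary table
    quietly summarize pop, meanonly
    display "Total pop: " %15.0fc r(sum)
    display "Observations: " %15.0fc _N
    * Count observations where pop is missing
    quietly count if missing(pop)
    display "Missing: " r(N)
end
```

Because the total is displayed with zero decimals, two runs could differ by less than one person and still print the same figure, so the stored values may be worth comparing directly as well.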
At the end, my total population and my number of observations in
baseline_5.dta are different every time I run this. I suspect the
difference is in rounding when it executes the gen pop line, but I've
tried replacing it with
gen double pop=tot_pop*prob_agecat_mpact
and
gen float pop=tot_pop*prob_agecat_mpact
And I still get differences.
I tried using
gen long pop=tot_pop*prob_agecat_mpact
But I lost too much precision by doing this.
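To make the three storage types concrete, the following is a small illustration on a one-observation scratch dataset. The variable names x_float, x_double, and x_long and the values 0.1 and 2.7 are made up for the example and have nothing to do with the real data; it only shows how each type stores a non-integer value, not why the runs differ.

```stata
* Illustration only; variable names and values are hypothetical.
clear
set obs 1
gen float  x_float  = 0.1
gen double x_double = 0.1
gen long   x_long   = 2.7
* float keeps about 7 significant digits, double about 16
display %20.18f x_float[1]
display %20.18f x_double[1]
* long is an integer type, so the fractional part is not kept
display x_long[1]
```

Note that a given storage type stores the same input the same way every time, so switching between float and double changes how much precision is kept but would not by itself be expected to make two otherwise identical runs disagree.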
Could you please recommend a solution to obtain the exact same numbers
in each run, without sacrificing precision?
Thanks!
Melanie
The log file for 2 of the runs I've done:
************* RUN A ***********************
. merge m:m statefips agecat_census using "ABCD.dta"
    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                        47,267,047  (_merge==3)
    -----------------------------------------
. assert _merge==3
. drop _merge
. egen tot_pop=sum(pop), by(statefips countyfips agecat_census sexcat racecat iprcat_mpac
> t iprcat coverage groupsize)
. checkpop
Total pop: 347,095,179
Observations: 47,267,047
Missing: 0
. rename pop oldpop
. gen pop=tot_pop*prob_agecat_mpact
. checkpop
Total pop: 332,455,972
Observations: 47,267,047
Missing: 0
. collapse (sum) pop, by(statefips countyfips agecat_mpact sexcat racecat iprcat_mpact ip
> rcat coverage groupsize)
. checkpop
Total pop: 332,455,972
Observations: 36,351,520
Missing: 0
************** RUN B *************
. merge m:m statefips agecat_census using "ABCD.dta"
    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                        47,267,047  (_merge==3)
    -----------------------------------------
. assert _merge==3
. drop _merge
. egen tot_pop=sum(pop), by(statefips countyfips agecat_census sexcat racecat iprcat_mpac
> t iprcat coverage groupsize)
. checkpop
Total pop: 347,095,179
Observations: 47,267,047
Missing: 0
. rename pop oldpop
. gen pop=tot_pop*prob_agecat_mpact
. checkpop
Total pop: 332,455,928
Observations: 47,267,047
Missing: 0
. collapse (sum) pop, by(statefips countyfips agecat_mpact sexcat racecat iprcat_mpact ip
> rcat coverage groupsize)
. checkpop
Total pop: 332,455,928
Observations: 36,351,515
Missing: 0
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/