Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Controling precision for multiple runs of same code (out of office until12th June)
From
"Seyi Soremekun" <[email protected]>
To
<[email protected]>
Subject
RE: st: Controling precision for multiple runs of same code (out of office until12th June)
Date
Tue, 04 Jun 2013 01:05:03 +0100
I am currently out of the office until the 12th June with limited email contact.
Please contact Angela Vega ([email protected]) for any enquiries.
>>> "[email protected]" <[email protected]> 06/04/13 01:03 >>>
Without going into much detail, be aware that a many-to-many merge (merge m:m) can yield
non-deterministic (and often meaningless) pairings of observations, leading to irreproducible or inconsistent results.
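
To make the point above concrete, here is a minimal sketch on toy data. Every name in it (key, lval, rval, lookup, master) is hypothetical and not taken from the original post. It relies on two documented behaviours: merge m:m pairs rows by position within each key group, and sort breaks ties in random order unless the stable option is given. The sketch contrasts m:m with joinby, which forms every within-key combination; whether joinby is the right replacement depends on what the merge is meant to do.

```stata
* Toy data; every name here (key, lval, rval, lookup, master) is hypothetical.
clear
input byte key str1 rval
1 "x"
1 "y"
end
sort key, stable
tempfile lookup
save `lookup'

clear
input byte key str1 lval
1 "a"
1 "b"
end
sort key, stable
tempfile master
save `master'

* isid exits with an error when key does not uniquely identify rows,
* so capture it and inspect the return code instead.
capture isid key
display "key unique in master? " cond(_rc == 0, "yes", "no")

* m:m pairs rows by position within each key group (1st with 1st,
* 2nd with 2nd), so the result depends on the within-group row order.
merge m:m key using `lookup'
list key lval rval, noobs
local n_mm = _N

* joinby forms every within-key combination instead, so the set of
* pairs does not depend on row order.
use `master', clear
joinby key using `lookup'
sort key lval rval
list key lval rval, noobs
local n_joinby = _N
```

If the using file is supposed to hold one row per key, merge m:1 is the stricter choice, since it stops with an error when that assumption fails instead of silently pairing rows.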
---Original Message---
From: [email protected]
Sent: 6/3/2013 7:56 pm
To: "statalist" <[email protected]>
Subject: st: Controling precision for multiple runs of same code
Hello,

I'm having trouble with a section of my code that yields different
results each time I run it.

I start out with a dataset, baseline_4.dta, which has 47,267,047
observations and 16 variables, and run this:

    merge m:m statefips agecat_census using "ABCD.dta"
    assert _merge==3
    drop _merge
    egen tot_pop=sum(pop), by(statefips countyfips agecat_census sexcat racecat iprcat_mpact iprcat coverage groupsize)
    checkpop
    rename pop oldpop
    gen pop=tot_pop*prob_agecat_mpact
    checkpop
    collapse (sum) pop, by(statefips countyfips agecat_mpact sexcat racecat iprcat_mpact iprcat coverage groupsize)
    checkpop
    sum
    sort _all
    save "baseline_5.dta", replace

checkpop is a program that tells me what my total population is each
time I run it. My total population is the same before and after the
collapse function (see results below).

At the end, my total population and my number of observations in
baseline_5.dta are different every time I run this. I suspect the
difference is in rounding when it executes the gen pop line, but I've
tried replacing it with

    gen double pop=tot_pop*prob_agecat_mpact

and

    gen float pop=tot_pop*prob_agecat_mpact

and I still get differences.

I tried using

    gen long pop=tot_pop*prob_agecat_mpact

but I lost too much precision by doing this.

Could you please recommend a solution to obtain the exact same numbers
in each run, without sacrificing precision?

Thanks!
Melanie

The log file for 2 of the runs I've done:

************* RUN A ***********************

. merge m:m statefips agecat_census using "ABCD.dta"

    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                        47,267,047  (_merge==3)
    -----------------------------------------

. assert _merge==3

. drop _merge

. egen tot_pop=sum(pop), by(statefips countyfips agecat_census sexcat racecat iprcat_mpact iprcat coverage groupsize)

. checkpop
Total pop: 347,095,179
Observations: 47,267,047
Missing: 0

. rename pop oldpop

. gen pop=tot_pop*prob_agecat_mpact

. checkpop
Total pop: 332,455,972
Observations: 47,267,047
Missing: 0

. collapse (sum) pop, by(statefips countyfips agecat_mpact sexcat racecat iprcat_mpact iprcat coverage groupsize)

. checkpop
Total pop: 332,455,972
Observations: 36,351,520
Missing: 0

************** RUN B *************

. merge m:m statefips agecat_census using "ABCD.dta"

    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                        47,267,047  (_merge==3)
    -----------------------------------------

. assert _merge==3

. drop _merge

. egen tot_pop=sum(pop), by(statefips countyfips agecat_census sexcat racecat iprcat_mpact iprcat coverage groupsize)

. checkpop
Total pop: 347,095,179
Observations: 47,267,047
Missing: 0

. rename pop oldpop

. gen pop=tot_pop*prob_agecat_mpact

. checkpop
Total pop: 332,455,928
Observations: 47,267,047
Missing: 0

. collapse (sum) pop, by(statefips countyfips agecat_mpact sexcat racecat iprcat_mpact iprcat coverage groupsize)

. checkpop
Total pop: 332,455,928
Observations: 36,351,515
Missing: 0

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/