
st: precision and ID numbers

From	Stephen Mennemeyer <[email protected]>
To	[email protected]
Subject	st: precision and ID numbers
Date	Wed, 15 Jan 2003 17:47:12 -0600

The Stata 7 manual, [U] 16.10, briefly mentions that some problems can arise because Stata by default stores numbers in single precision (float) but performs calculations in double precision.

I have found another situation where double precision is required in Stata for numbers that would seem safe to manipulate in single precision.

Consider a typical (hypothetical) Social Security number stored as a string variable:
123-45-6789

One might wish to convert this to a numeric variable for easier manipulation, such as sorting in a more appealing manner.

If one parses the SSN into its numeric components, scales each by the appropriate power of ten, and then adds them back together, the result is a bit surprising when the new variable is stored as the default float: the last digits come out wrong. The process has to be done in double precision to get the expected result.
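The reason appears to be that a float carries a 24-bit significand, so it holds every integer exactly only up to 2^24 = 16,777,216, and a nine-digit ID goes well beyond that. A minimal sketch using Stata's float() function, which rounds its argument to float precision (the particular test values are my own choice, not from the manual):

```stata
* largest power of two below which every integer fits in a float
display %12.0f 2^24

* 16777216 survives rounding to float unchanged
display %12.0f float(16777216)

* 16777217 does not: the result is a neighbouring float, not 16777217
display %12.0f float(16777217)

* the first hypothetical SSN, rounded to float, displays as 123456792
display %12.0f float(123456789)
```

Any ID with more than seven digits is therefore at risk when stored as a float, even though the arithmetic itself is carried out in double precision.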

. list ssnchar /* from an external file of hypothetical SSNs */

ssnchar
1. 123-45-6789
2. 987-65-4321
3. 078-94-5612
4. 321-65-7894
5. 978-54-6231

/* parse the string variable ssnchar into its component parts, scale each part to its position in the final number, and then add the parts */
. gen double p1=real(substr(ssnchar,1,3))
. gen double p2=real(substr(ssnchar,5,2))
. gen double p3=real(substr(ssnchar,8,4))
. gen double ssndbl=p1*1000000+p2*10000+p3
. format ssndbl %9.0f

/* with double precision the results are as expected */
. list
         ssnchar    p1   p2     p3      ssndbl
  1. 123-45-6789   123   45   6789   123456789
  2. 987-65-4321   987   65   4321   987654321
  3. 078-94-5612    78   94   5612    78945612
  4. 321-65-7894   321   65   7894   321657894
  5. 978-54-6231   978   54   6231   978546231
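One way to confirm that the double version lost nothing is to rebuild the hyphenated string from the number and compare it with the original. This is only a sketch: the variable names digits and rebuilt are hypothetical, and it re-enters two of the SSNs above so that it runs on its own.

```stata
clear
input str11 ssnchar
"123-45-6789"
"078-94-5612"
end

* same construction as above, done in double precision
gen double ssndbl = real(substr(ssnchar,1,3))*1000000 + real(substr(ssnchar,5,2))*10000 + real(substr(ssnchar,8,4))

* %09.0f pads with leading zeros, so 78945612 becomes "078945612"
gen str9 digits = string(ssndbl, "%09.0f")
gen str11 rebuilt = substr(digits,1,3) + "-" + substr(digits,4,2) + "-" + substr(digits,6,4)

* stops with an error if any ID fails to round-trip
assert rebuilt == ssnchar
```

The leading zero of 078-94-5612 is dropped in the numeric form but comes back through the display format, so no information is lost.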

/* if we instead use the default float storage type, the resulting variable ssnf is not what might be anticipated */
. gen p1f=real(substr(ssnchar,1,3))
. gen p2f=real(substr(ssnchar,5,2))
. gen p3f=real(substr(ssnchar,8,4))
. gen ssnf=p1f*1000000+p2f*10000+p3f

. format ssnf %9.0f
. list
. list ssn*
         ssnchar      ssndbl        ssnf
  1. 123-45-6789   123456789   123456792
  2. 987-65-4321   987654321   987654336
  3. 078-94-5612    78945612    78945616
  4. 321-65-7894   321657894   321657888
  5. 978-54-6231   978546231   978546240
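Double is not the only storage type that works here. A long holds integers exactly up to about 2.1 billion, which is enough for nine digits, and destring can strip the hyphens directly. The following is a sketch under those assumptions; the names ssnlong and ssnnum are hypothetical, and the data are re-entered so that the block runs on its own.

```stata
clear
input str11 ssnchar
"123-45-6789"
"987-65-4321"
"078-94-5612"
"321-65-7894"
"978-54-6231"
end

* reference (double) and problem (float) versions, built as above
gen double ssndbl = real(substr(ssnchar,1,3))*1000000 + real(substr(ssnchar,5,2))*10000 + real(substr(ssnchar,8,4))
gen float ssnf = ssndbl

* alternative 1: join the digit strings and store the value as a long
gen long ssnlong = real(substr(ssnchar,1,3) + substr(ssnchar,5,2) + substr(ssnchar,8,4))

* alternative 2: let destring ignore the hyphens and choose the storage type
destring ssnchar, generate(ssnnum) ignore("-")

format ssndbl ssnf ssnlong ssnnum %9.0f

* how many observations does the float version get wrong?
count if ssnf != ssndbl

* both alternatives agree with the double version
assert ssnlong == ssndbl
assert ssnnum == ssndbl
```

With the five SSNs listed above, the count should report that every float value differs from its double counterpart.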

--
Stephen T. Mennemeyer Ph.D.
University of Alabama at Birmingham
School of Public Health
Dept. of Health Care Organization and Policy

U.S. Mail:
1530 3rd Ave. South 330 RPHB
Birmingham, Al 35294-0022

Express Delivery:
330 Ryals Public Health Building
1665 University Blvd.
Birmingham, Al 35294-0022

Phone: (205) 975-8965
FAX (205) 934-3347
e-mail: [email protected]
