Re: st: Numbers with decimals and -float- command
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Numbers with decimals and -float- command
Date: Thu, 17 Nov 2011 23:40:00 +0000

-round()- is a function, not a command.
It would be nice if you had solved your problem but you are still
playing with fire. -round()- is defined generally, but its use with
non-integers is fraught with dangers.
Nick
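
One way to sidestep -round()- with a non-integer second argument is sketched here, assuming (as stated in the quoted message that follows) that the totsh variables are in millions with six legitimate decimal places. The idea is to rescale to whole units, round to integers, and compare the integers. The names units1, units2 and ge12 are hypothetical and do not come from the thread.

```stata
* Sketch only: units1, units2 and ge12 are hypothetical names.
* Rescale millions with six decimals to whole units and round to integers.
* Integers of this size are held exactly in a double.
gen double units1 = round(totsh1 * 1e6)
gen double units2 = round(totsh2 * 1e6)

* Compare the integer versions. Missing values are excluded explicitly,
* because Stata treats missing as larger than any nonmissing number.
gen byte ge12 = 1 if units2 >= units1 & !missing(units1, units2)
```

Whether this is preferable to the -round(x, 0.000001)- comparison depends on the data; it simply keeps the compared values on a grid that binary arithmetic can represent exactly.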
On Thu, Nov 17, 2011 at 9:49 PM, Joseph Monte <[email protected]> wrote:
> Thanks, Nick. I think I may have solved my problem by using the
> -round()- command since totsh1, totsh2, etc. have six legitimate
> decimal places (the numbers represent millions).
>
> Specifically,
>
> gen x=1 if round(totsh2,0.000001)>=round(totsh1,0.000001)
>
> Joe
>
>
>
> On Tue, Nov 8, 2011 at 8:38 PM, Nick Cox <[email protected]> wrote:
>> So they're different. One bit is enough!
>>
>> This is all about the fact that most decimal calculations do not have _exact_ binary equivalents.
>>
>> Forgetting about the 2, which is an integer, the difference is between
>>
>> .09 + .235
>>
>> and
>>
>> .325
>>
>> and everyone on this list knows that away from a computer they are equivalent. But Stata can only do the first calculation by converting from decimal to binary first and converting back afterwards:
>>
>> . di %21x .325
>> +1.4cccccccccccdX-002
>>
>> . di %21x .09 + .235
>> +1.4ccccccccccccX-002
>>
>> And there is the smallest possible difference other than zero.
>>
>> The main way to deal with this problem is just to ignore it.
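
For anyone who wants to check this directly, here is a minimal sketch using only the literals from the example above. Given that the two hex results differ, the first comparison should display 0. The last line shows what happens when both sides are first rounded to float precision with the -float()- function.

```stata
* The literal comparison, evaluated in double precision
di (.09 + .235) == .325

* The size of the gap, in decimal and in hexadecimal
di %12.0g (.325 - (.09 + .235))
di %21x   (.325 - (.09 + .235))

* The same comparison after rounding both sides to float precision
di float(.09 + .235) == float(.325)
```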
>>
>> As you are aware of -float()- you are probably aware of discussions of this point, but here are some references. I'd start with Bill Gould's blog entries.
>>
>> FAQ . . . . . . . . . . . . . . . . . . . Results of the mod(x,y) function
>> . . . . . . . . . . . . . . . . . . . . . N. J. Cox and T. J. Steichen
>> 2/03 Why does the mod(x,y) function sometimes give
>> puzzling results?
>> Why is mod(0.3,0.1) not equal to 0?
>> http://www.stata.com/support/faqs/data/mod.html
>>
>> FAQ . . . . . . . . . . . . . . . . . The accuracy of the float data type
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
>> 5/01 How many significant digits are there in a float?
>> http://www.stata.com/support/faqs/data/prec.html
>>
>> FAQ . . . . . . . . . Comparing floating-point values (the float function)
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J. Wernow
>> 3/01 Why can't I compare two values that I know are equal?
>> http://www.stata.com/support/faqs/data/float.html
>>
>> Blog . . . . . . . . . . . . . . . . . How to read the %21x format, part 2
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
>> 2/11 http://blog.stata.com/2011/02/10/how-to-read-the-percent-21x-format-part-2/
>>
>> FAQ . . . . . . . . . Why am I losing precision with large whole numbers?
>> . . . . . . . . . . . . . . . . . . UCLA Academic Technology Services
>> 7/08 http://www.ats.ucla.edu/stat/stata/faq/longid.htm
>>
>> SJ-8-2 pr0038 Mata Matters: Overflow, underflow & IEEE floating-point format
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . J. M. Linhart
>> Q2/08 SJ 8(2):255--268 (no commands)
>> focuses on underflow and overflow and details of how
>> floating-point numbers are stored in the IEEE 754
>> floating-point standard
>>
>> SJ-6-4 pr0025 . . . . . . . . . . . . . . . . . . . Mata matters: Precision
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . W. Gould
>> Q4/06 SJ 6(4):550--560 (no commands)
>> looks at programming implications of the floating-point,
>> base-2 encoding that modern computers use
>>
>> SJ-6-2 dm0022 . Tip 33: Sweet sixteen: Hexadec. formats & precision problems
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
>> Q2/06 SJ 6(2):282--283 (no commands)
>> tip for using hexadecimal formats to understand precision
>> problems in Stata
>>
>> Nick
>> [email protected]
>>
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Joseph Monte
>> Sent: 08 November 2011 20:17
>> To: [email protected]
>> Subject: Re: st: Numbers with decimals and -float- command
>>
>> Nick,
>>
>> Thanks for the suggestion and sorry for a) lack of clarity and b) the
>> misquote (Jeremy). I ran your suggested code and got the following.
>> Why is totsh1 not equal to totsh2?
>>
>> . assert totsh1 == totsh2 in 1870
>> assertion is false
>> r(9);
>>
>> .
>> . di %21x (totsh1[1870] - totsh2[1870])
>> +1.0000000000000X-016
>>
>>
>> Note:- totsh1 = primsh1 + secsh1. Similarly, totsh2 = primsh2 + secsh2
>>
>> . list primsh1 secsh1 totsh1 primsh2 secsh2 totsh2 in 1870
>>
>>       +-------------------------------------------------------+
>>       | primsh1   secsh1   totsh1   primsh2   secsh2   totsh2 |
>>       |-------------------------------------------------------|
>> 1870. |       2     .325    2.325      2.09     .235    2.325 |
>>       +-------------------------------------------------------+
>>
>> As a further test, I recreated totsh1 & totsh2 naming them x and y
>> respectively. But, the results are the same.
>>
>> . gen x=primsh1+secsh1
>> (4268 missing values generated)
>>
>> . gen y=primsh2+secsh2
>> (4268 missing values generated)
>>
>> . list x y in 1870
>>
>>       +---------------+
>>       |     x       y |
>>       |---------------|
>> 1870. | 2.325   2.325 |
>>       +---------------+
>>
>> . assert x == y in 1870
>> assertion is false
>> r(9);
>>
>> .
>> .
>> .
>> . di %21x (x[1870] - y[1870])
>> +1.0000000000000X-016
>>
>>
>> Thanks,
>>
>> Joe
>>
>>
>>
>>
>>
>> On Tue, Nov 8, 2011 at 7:26 PM, Nick Cox <[email protected]> wrote:
>>> I found this difficult to follow. It's not clear that we need to
>>> understand "upwards", "downwards" and "sideways", as the key question
>>> is whether or not certain values are equal.
>>>
>>> First note that the FAQ you cite is due to Jeremy Wernow.
>>>
>>> You are puzzled why your -if- condition ignores e.g. observation 1870
>>> in which you have
>>>
>>> 1870. 2.325 2.325 2.525 . . 2.525
>>>
>>> and so, on the face of it, should we be, because (e.g.) the first and
>>> second values look identical. But that is just a matter of display format.
>>>
>>> As those variables are of the same type I would just look directly at
>>> those values
>>>
>>> assert totsh1 == totsh2 in 1870
>>> di %21x (totsh1[1870] - totsh2[1870])
>>>
>>> I don't see that the -float()- function will be illuminating here.
>>>
>>> Nick
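
A possible reason -float()- is not illuminating here: the -describe- output in the original post shows totsh1 and totsh2 are already stored as float, so applying -float()- to them changes nothing. The difference reported elsewhere in this thread, +1.0000000000000X-016, is 2^-22 (the exponent in %21x is hexadecimal), which is the spacing between adjacent floats between 2 and 4. The sketch below tries to rebuild observation 1870 from the components listed in the thread. It assumes the stored components are the float values nearest to the displayed ones, which the thread does not confirm, and the 1e-6 tolerance is an arbitrary illustration.

```stata
* Sketch: rebuild observation 1870 from the listed components.
* Assumption: each component is the float nearest to the displayed value.
clear
set obs 1
gen float primsh1 = 2
gen float secsh1  = .325
gen float primsh2 = 2.09
gen float secsh2  = .235
gen float totsh1  = primsh1 + secsh1
gen float totsh2  = primsh2 + secsh2

* The same checks as suggested above
di %21x totsh1[1]
di %21x totsh2[1]
di %21x (totsh1[1] - totsh2[1])

* float() leaves values that are already floats unchanged
di float(totsh1[1]) == float(totsh2[1])

* One alternative: a relative tolerance (1e-6 is an arbitrary choice)
di reldif(totsh1[1], totsh2[1]) < 1e-6
```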
>>>
>>> On Mon, Nov 7, 2011 at 7:47 PM, Joseph Monte <[email protected]> wrote:
>>>
>>>> The output below should contain only observations where there are both
>>>> upwards and downwards (or vice versa) movements in "totsh" (from
>>>> "totsh1" through "totsho~r"). Sideways movements are allowed. As an
>>>> example, obs 1157 has a downward movement from "totsh1" to "totsh2"
>>>> then upward to "totsh3" and then sideways to "totsho~r", which is
>>>> fine. "obs" is the number of "totsh*" observations in each row.
>>>>
>>>> However, notice observations 1870 (where the path is upward and NOT
>>>> downward) & 3275 (where the path is downward and NOT upward). These
>>>> observations should not be in type 3 but in type 1 (which captures
>>>> upward and sideways movements) and type 2 (which captures downward
>>>> and sideways movements), respectively; those types are not shown for
>>>> brevity. On further investigation, I expected the issue to be resolved
>>>> if I used the -float- command from Nick's FAQ.
>>>>
>>>> http://www.stata.com/support/faqs/data/float.html
>>>>
>>>> However, as shown below, the -float- command does not seem to solve
>>>> the problem. In obs 1870, totsh1 & totsh2 do not seem to be equal even
>>>> though both are 2.325. Same issue for obs 3275.
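
The classification described above can be sketched with an explicit tolerance, so that gaps at the level of float rounding count as sideways. This is not the code used in the thread, which is not shown. The names anyup, anydown, prev and type_tol are hypothetical, and the tolerance of 5e-7 (half of the sixth decimal place) is an assumption that may be too small for larger values held as float.

```stata
* Sketch only: anyup, anydown, prev, type_tol and tol are hypothetical.
local tol 5e-7

gen byte   anyup   = 0
gen byte   anydown = 0
gen double prev    = .

foreach v of varlist totsh1 totsh2 totsh3 totsh4 totsh5 totshoffer {
    * A move counts only if it exceeds the tolerance; smaller gaps are sideways
    replace anyup   = 1 if !missing(`v', prev) & `v' - prev >  `tol'
    replace anydown = 1 if !missing(`v', prev) & `v' - prev < -`tol'
    * Carry the latest nonmissing value forward as the comparison base
    replace prev = `v' if !missing(`v')
}

* 1 = upward (and sideways) only, 2 = downward (and sideways) only, 3 = both
gen byte type_tol = 1 if  anyup & !anydown
replace  type_tol = 2 if !anyup &  anydown
replace  type_tol = 3 if  anyup &  anydown
drop prev
```

Rows with no movement beyond the tolerance are left missing in type_tol, since the thread does not say how purely sideways paths should be classified.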
>>>>
>>>>
>>>> . list totsh1 totsh2 totsh3 totsh4 totsh5 totshoffer obs type if type==3
>>>>
>>>>       +----------------------------------------------------------------------------+
>>>>       |   totsh1     totsh2     totsh3     totsh4   totsh5   totsho~r   obs   type |
>>>>       |----------------------------------------------------------------------------|
>>>> 1157. |      3.5   3.483289        3.5          .        .        3.5     4      3 |
>>>> 1362. | 1.615159   1.588584          .          .        .          2     3      3 |
>>>> 1543. |      1.5          2          .          .        .       1.75     3      3 |
>>>> 1691. |       20         25         21         15        .         15     5      3 |
>>>> 1762. |     1.75        1.9          .          .        .      1.865     3      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 1764. | 1.785918    1.68277          .          .        .        2.4     3      3 |
>>>> 1768. |     2.25          2          .          .        .   2.666667     3      3 |
>>>> 1771. |      2.5        2.5          3          .        .        2.9     4      3 |
>>>> 1774. |      5.5          4        4.7       4.65     4.65       4.65     6      3 |
>>>> 1870. |    2.325      2.325      2.525          .        .      2.525     4      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 2115. |    2.475       2.14        2.4          .        .        2.4     4      3 |
>>>> 2256. |      2.1       1.85          .          .        .        2.1     3      3 |
>>>> 2514. |      2.5       2.75          .          .        .        2.4     3      3 |
>>>> 2524. |        4        2.7        2.2          2        .        2.2     5      3 |
>>>> 2598. |      2.5          2       2.35          .        .        2.5     4      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 2606. |      3.7       2.75          .          .        .       2.85     3      3 |
>>>> 2645. |      3.4        2.3        3.3          .        .          3     4      3 |
>>>> 2657. |      2.3        2.5        2.1       1.65        .       1.65     5      3 |
>>>> 2719. |      2.5   2.949862          .          .        .        2.5     3      3 |
>>>> 2737. |        2        1.5          .          .        .        1.7     3      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 2760. |        1        1.2         .9          .        .         .9     4      3 |
>>>> 2782. |     2.25          2          .          .        .        2.5     3      3 |
>>>> 2838. |    5.883          4          .          .        .        4.8     3      3 |
>>>> 2912. |        2      2.455        1.8          .        .        1.8     4      3 |
>>>> 2962. |     1.15          1       1.05          .        .       1.05     4      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 2980. |      2.7          2        2.3          .        .        2.3     4      3 |
>>>> 2987. |        2        1.4        1.6          .        .       1.92     4      3 |
>>>> 3027. |     5.45       5.55       5.65      5.553        .      5.553     5      3 |
>>>> 3096. |      1.8       1.85       1.25       1.35        .       1.35     5      3 |
>>>> 3132. |      1.5          1       1.25          .        .       1.25     4      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 3188. |      2.3        2.7          .          .        .        2.3     3      3 |
>>>> 3251. |     17.2          6          .          .        .          7     3      3 |
>>>> 3275. |      6.8        6.8          5          .        .          5     4      3 |
>>>> 3286. |      1.8        1.4        1.5          .        .        1.5     4      3 |
>>>> 3306. |        6          4          5          .        .          5     4      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 3488. |      2.5        2.2        2.5          .        .        2.5     4      3 |
>>>> 3519. |    16.25      13.25       13.3       14.9        .       14.9     5      3 |
>>>> 3566. |   12.575       10.5          5     4.0625        .      4.665     5      3 |
>>>> 3667. |      3.5          4        3.6          .        .        3.6     4      3 |
>>>> 3877. |     6.25        5.5        6.5          .        .        6.5     4      3 |
>>>>       |----------------------------------------------------------------------------|
>>>> 3919. |        8       11.5        8.5          .        .        8.5     4      3 |
>>>> 3944. |      7.5        4.7          .          .        .          5     3      3 |
>>>> 3954. |        6          5          .          .        .       6.44     3      3 |
>>>> 4002. |     10.3       14.6         10          .        .         10     4      3 |
>>>> 4014. |        5       4.95   5.030305   5.045972        .   5.295972     5      3 |
>>>>       +----------------------------------------------------------------------------+
>>>>
>>>> . list totsh1 totsh2 type if float(totsh1)==float(totsh2) & totsh1!=. & type==3
>>>>
>>>>       +------------------------+
>>>>       | totsh1   totsh2   type |
>>>>       |------------------------|
>>>> 1771. |    2.5      2.5      3 |
>>>>       +------------------------+
>>>>
>>>> . des totsh1 totsh2
>>>>
>>>>               storage  display     value
>>>> variable name   type   format      label      variable label
>>>> ------------------------------------------------------------------------
>>>> totsh1          float  %9.0g
>>>> totsh2          float  %9.0g
>>>>
>>>>
>>>> I would appreciate any help on the issue. I am using Stata 12.
>>>