Best floating-point questions in November 2010

17 votes

[Python 3.1]

I encountered negative zero in output from Python; it is created, for example, as follows:

k = 0.0
print(-k)

The output will be -0.0.

However, when I compare -k to 0.0 for equality, it yields True. Is there any difference between 0.0 and -0.0? (I don't care that they presumably have different internal representations; I only care about their behavior in a program.) Are there any hidden traps I should be aware of?

Check out −0 (number) on Wikipedia.

Basically, IEEE does actually define a negative zero.

And by this definition, for all purposes:

-0.0 == +0.0 == 0

I agree with aaronasterling that -0.0 and +0.0 are different objects. Making them compare equal (under the equality operator) ensures that subtle bugs are not introduced in the code. Think of a * b == c * d:

>>> a = 3.4
>>> b = 4.4
>>> c = -0.0
>>> d = +0.0
>>> a*c
-0.0
>>> b*d
0.0
>>> a*c == b*d
True

[Edit: More info based on comments]

When I said "for all practical purposes", I chose the words rather hastily. I meant standard equality comparison.

I would add more information and references in this regard:

(1) As the reference says, the IEEE standard defines comparison so that +0 = -0, rather than -0 < +0. Although it would be possible always to ignore the sign of zero, the IEEE standard does not do so. When a multiplication or division involves a signed zero, the usual sign rules apply in computing the sign of the answer.

Operations such as divmod and atan2 exhibit this behavior. In fact, atan2 complies with the IEEE definition, as does the underlying C library. See reference 2 for the definition.

>>> divmod(-0.0,100)
(-0.0, 0.0)
>>> divmod(+0.0,100)
(0.0, 0.0)

>>> import math
>>> math.atan2(0.0, 0.0) == math.atan2(-0.0, 0.0)
True
>>> math.atan2(0.0, -0.0) == math.atan2(-0.0, -0.0)
False
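
Because == cannot tell the two zeros apart, a program that cares about the sign has to inspect it explicitly. The following is a minimal sketch of one way to do that with math.copysign; the helper name is_negative_zero is hypothetical and not part of the standard library.

```python
import math

def is_negative_zero(x):
    """Return True only for -0.0; equality alone cannot distinguish the zeros."""
    # copysign transfers the sign bit of x onto 1.0, giving -1.0 for -0.0.
    return x == 0.0 and math.copysign(1.0, x) < 0

pos, neg = 0.0, -0.0
print(pos == neg)               # True
print(is_negative_zero(neg))    # True
print(is_negative_zero(pos))    # False

# The sign of zero selects the branch of atan2 on the negative x-axis.
print(math.atan2(pos, -1.0))    # 3.141592653589793
print(math.atan2(neg, -1.0))    # -3.141592653589793
```

In practice the sign of zero mostly matters for functions with a branch cut, such as atan2, and for formatted output.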

One way to find out whether an implementation complies with IEEE behavior is through its documentation. It also seems from the discussion that there are subtle platform variations.

However, this aspect (compliance with the IEEE definition) has not been respected everywhere. See the rejection of PEP 754 (reference 3) due to lack of interest! I am not sure whether this was picked up later.

References:

  1. http://docs.sun.com/source/806-3568/ncg_goldberg.html#924
  2. FPTAN in http://en.wikipedia.org/wiki/Atan2
  3. http://www.python.org/dev/peps/pep-0754/

Why does [float.MaxValue == float.MaxValue + 1] return true?

16 votes

I wonder if you could explain the Overflow in floating-point types.

float.MaxValue == float.MaxValue + 1 // returns true

Because the 1 is way too small to make a dent in the float.MaxValue value.

Anything below roughly 1e31 (half the gap between adjacent float values at that magnitude) falls below the precision of the float, so adding it has the same effect as adding zero.

Edit:

ulrichb showed that a value of 1e23 does actually affect float.MaxValue, which has to mean that you are not comparing floats at all, but doubles. The compiler converts all values to doubles before adding and comparing.
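
The arithmetic behind this can be checked with a short sketch. It is written in Python rather than C#, and Python floats are doubles, so it only reproduces the double-precision side of the argument; the constant names FLT_MAX, FLOAT_GAP and DOUBLE_GAP are introduced here for illustration.

```python
# Largest finite 32-bit float; this value is exactly representable as a double.
FLT_MAX = (2.0 - 2.0**-23) * 2.0**127

# Spacing between adjacent representable values just below 2**128.
FLOAT_GAP = 2.0**104    # 32-bit float: 24-bit significand
DOUBLE_GAP = 2.0**75    # 64-bit double: 53-bit significand

# In double arithmetic, 1.0 is far below half of DOUBLE_GAP and is absorbed.
print(FLT_MAX + 1.0 == FLT_MAX)     # True

# 1e23 is larger than half of DOUBLE_GAP, so a double sum does change.
print(FLT_MAX + 1e23 == FLT_MAX)    # False

# The same 1e23 is far below half of FLOAT_GAP, so a true 32-bit sum would not change.
print(1e23 < FLOAT_GAP / 2)         # True
```

A value that changes the sum while being smaller than half of FLOAT_GAP is therefore evidence that the addition was carried out in a wider format than a 32-bit float.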

Evil in the python decimal / float

14 votes

I have a large amount of Python code that tries to handle numbers with 4 decimal places of precision, and I am stuck with Python 2.4 for many reasons. The code does very simple math (it's credit management code that mostly adds or removes credits).

It mixes float and Decimal (MySQLdb returns Decimal objects for SQL DECIMAL types). After several strange bugs, I found the root cause of all of them to be a few places in the code where floats and Decimals are compared.

I got to cases like this:

>>> from decimal import Decimal
>>> max(Decimal('0.06'), 0.6)
Decimal("0.06")

Now my fear is that I might not be able to catch all such cases in the code. (A typical programmer will keep writing x > 0 instead of x > Decimal('0.0000'), and that is very hard to avoid.)

I have come up with a patch (inspired by improvements to the decimal package in Python 2.7).

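import decimal
from decimal import Decimal

def _convert_other(other):
    """Convert other to Decimal.

    Verifies that it's ok to use in an implicit construction.
    """
    if isinstance(other, Decimal):
        return other
    if isinstance(other, (int, long)):  # Python 2: long is a separate integer type
        return Decimal(other)
    # Our small patch begins
    if isinstance(other, float):
        # str() rather than repr(), so that 0.6 becomes Decimal("0.6")
        return Decimal(str(other))
    # Our small patch ends
    return NotImplemented

# Replace the module-level helper that decimal uses for implicit conversions
decimal._convert_other = _convert_other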

I apply it in a library that is loaded very early, and it changes the decimal package's behavior by allowing float-to-Decimal conversion before comparisons (to avoid hitting Python's default object-to-object comparison).

I specifically used "str" instead of "repr" as it fixes some of float's rounding cases. E.g.

>>> Decimal(str(0.6))
Decimal("0.6")
>>> Decimal(repr(0.6))
Decimal("0.59999999999999998")

Now my question is: am I missing anything here? Is this fairly safe, or am I breaking something? (I am thinking the authors of the package had very strong reasons to avoid floats so much.)

I think you want raise NotImplementedError() instead of return NotImplemented, to start.

What you're doing is called "monkey patching", and it is OK to do so long as you know what you're doing, are aware of the fallout, and are OK with that fallout. Generally you limit this to fixing a bug, or to some other change where you know your alteration of the behavior is still correct and backwards compatible.

In this case, because you're patching a class, you can change behavior outside of the cases where you use it. If another library uses decimal, and somehow relies on the default behavior, it might cause subtle bugs. The trouble is you don't really know unless you audit all your code, including any dependencies, and find all the call sites.

Basically - do it at your own risk.

Personally I find it more reassuring to fix all my code, add tests, and make it harder to do the wrong thing (e.g., use wrapper classes or helper functions). Another approach would be to instrument your code with your patch to find all the call sites, then go back and fix them.
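
As a rough sketch of the helper-function approach, the code below coerces every value to a four-place Decimal before it is compared. The names FOUR_PLACES, to_credit and credit_max are hypothetical, and the four-place precision is an assumption taken from the question; the sketch is written for a current Python, not 2.4.

```python
from decimal import Decimal

# Assumed precision, based on the question's "4 decimal" requirement.
FOUR_PLACES = Decimal("0.0001")

def to_credit(value):
    """Coerce an int, float, str or Decimal to a Decimal with four decimal places."""
    if isinstance(value, float):
        # Go through str(), as the question's patch does, so 0.6 becomes Decimal("0.6").
        value = str(value)
    return Decimal(value).quantize(FOUR_PLACES)

def credit_max(a, b):
    """Compare two amounts only after both have been normalized to Decimal."""
    return max(to_credit(a), to_credit(b))

print(credit_max(Decimal("0.06"), 0.6))    # 0.6000
print(to_credit(1) > to_credit(0.9999))    # True
```

Routing every comparison through one function keeps the conversion policy in a single place and leaves the decimal module itself untouched.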

Edit - I guess I should add that the probable reason they avoided floats is floats can't accurately represent all numbers, which is important if you're dealing with money.
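
A small illustration of that point, assuming a Python version in which Decimal accepts a float directly (2.7 or 3.2 and later, so not the 2.4 of the question):

```python
from decimal import Decimal

# Binary floats cannot represent 0.1, 0.2 or 0.3 exactly, so the sum drifts.
print(0.1 + 0.2 == 0.3)                                     # False

# Decimals built from strings hold exactly the digits that were written.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))    # True

# Converting a float directly exposes the binary approximation it stores.
print(Decimal(0.6) == Decimal("0.6"))                       # False

# Going through str() recovers the short decimal form instead.
print(Decimal(str(0.6)) == Decimal("0.6"))                  # True
```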