Best python-3.x questions in May 2012

str performance in python

68 votes

While profiling a piece of python code (python 2.6 up to 3.2), I discovered that the str method to convert an object (in my case an integer) to a string is almost an order of magnitude slower than using string formatting.

Here is the benchmark

>>> from timeit import Timer
>>> Timer('str(100000)').timeit()
0.3145311339386332
>>> Timer('"%s"%100000').timeit()
0.03803517023435887

Does anyone know why this is the case? Am I missing something?

'%s' % 100000 is evaluated by the compiler and is equivalent to a constant at run-time.

>>> import dis
>>> dis.dis(lambda: str(100000))
  8           0 LOAD_GLOBAL              0 (str)
              3 LOAD_CONST               1 (100000)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE        
>>> dis.dis(lambda: '%s' % 100000)
  9           0 LOAD_CONST               3 ('100000')
              3 RETURN_VALUE        

% with a run-time expression is not (significantly) faster than str:

>>> Timer('str(x)', 'x=100').timeit()
0.25641703605651855
>>> Timer('"%s" % x', 'x=100').timeit()
0.2169809341430664

Do note that str is still slightly slower, as @DietrichEpp said, this is because str involves lookup and function call operations, while % compiles to a single immediate bytecode:

>>> dis.dis(lambda x: str(x))
  9           0 LOAD_GLOBAL              0 (str)
              3 LOAD_FAST                0 (x)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE        
>>> dis.dis(lambda x: '%s' % x)
 10           0 LOAD_CONST               1 ('%s')
              3 LOAD_FAST                0 (x)
              6 BINARY_MODULO       
              7 RETURN_VALUE        

Of course the above is true for the system I tested on (CPython 2.7); other implementations may differ.

Python 3.x rounding behavior

15 votes

I was just re-reading What’s New In Python 3.0 and it states:

The round() function rounding strategy and return type have changed. Exact halfway cases are now rounded to the nearest even result instead of away from zero. (For example, round(2.5) now returns 2 rather than 3.)

and the documentation for round:

For the built-in types supporting round(), values are rounded to the closest multiple of 10 to the power minus n; if two multiples are equally close, rounding is done toward the even choice

So, under v2.7.3:

In [85]: round(2.5)
Out[85]: 3.0

In [86]: round(3.5)
Out[86]: 4.0

as I'd have expected. However, now under v3.2.3:

In [32]: round(2.5)
Out[32]: 2

In [33]: round(3.5)
Out[33]: 4

This seems counter-intuitive and contrary to what I understand about rounding (and bound to trip up people). English isn't my native language but until I read this I thought I knew what rounding meant :-/ I am sure at the time v3 was introduced there must have been some discussion of this, but I was unable to find a good reason in my search.

  1. Does anyone have insight into why this was changed to this?
  2. Are there any other mainstream programming languages (e.g., C, C++, Java, Perl, ..) that do this sort of (to me inconsistent) rounding?

What am I missing here?

UPDATE: @Li-aungYip's comment re "Banker's rounding" gave me the right search term/keywords to search for and I found this SO question: Why does .NET use banker's rounding as default?, so I will be reading that carefully.

Python 3.0's way is considered the standard rounding method these days, though some language implementations aren't on the bus yet.

The simple "always round 0.5 up" technique results in a slight bias toward the higher number. With large numbers of calculations, this can be significant. The Python 3.0 approach eliminates this issue.

Your puzzlement may derive from a misconception that there is only one method of rounding. IEEE 754, the international standard for floating-point math, defines five different rounding methods (the one used by Python 3.0 is the default). And there are others.

You're not alone, however; this behavior is not as widely known as it ought to be. AppleScript was, if I remember correctly, an early adopter of this rounding method. The round command in AppleScript actually does offer several options, but round-toward-even is the default as it is in IEEE 754. Apparently the engineer who implemented the round command got so fed up with all the requests to "make it work like I learned in school" that he implemented just that: round 2.5 rounding as taught in school is a valid AppleScript command. :-)