### Diving into a floating point bug

March 2nd, 2017
tech  [html]
Yesterday I refactored some python code that effectively changed:
```   floor(n * (100 / 101))
```
into:
```   floor(n * Decimal(100/101))
```
For most integers this doesn't change anything, but for multiples of 101 it gives you different answers:
```  floor(101 * (100 / 101))       -> 100
floor(101 * Decimal(100/101))  ->  99
```
In python2, however, both give 100:
```  floor(101 * (100.0 / 101))       -> 100
floor(101 * Decimal(100.0/101))  -> 100
```
What's going on? First, what's the difference between python2 and python3 here? One of the changes not listed in What's New In Python 3.0 is that `floor` now returns an `int` instead of a `float`:
• 2.x: Return the floor of `x` as a `float`, the largest integer value less than or equal to `x`.
• 3.x: Return the floor of `x`, the largest integer less than or equal to `x`. If `x` is not a `float`, delegates to `x.__floor__()`, which should return an `Integral` value.
Since these are positive numbers, we can use `int()` instead of `floor()` and now python2 and python3 give the same result:
```  2.x: int(101 * Decimal(100.0/101)) -> 99
3.x: int(101 * Decimal(100/101))   -> 99
```
The root of the problem is that `100/101` is a repeating decimal, `0.99009900...`, and so can't be represented exactly as a `Decimal` or a `float`. To store it as a `Decimal` python defaults to 28 decimal digits and rounds it:
```> D(100)/D(101)
Decimal('0.9900990099009900990099009901')
```
Storing it as a `float` is more complicated. Python uses the platform's hardware support, which is IEEE 754 floating point bascially everywhere, and it gets converted to a binary fraction with 53 bits for the numerator and a power of two for the denominator:
```  1114752383012499 / 2**50
```
Python is aware that there are many numbers for which this is the closest possible floating point approximation, and so to be friendly it selectes the shortest decimal representation when printing it:
```> 100/101
0.9900990099009901
```

Now we have all the pieces to see why `101*Decimal(100/101)` is less than `101*(100/101)`, and, critically, why one is just under 100 while the other is exactly 100.

In the `Decimal` case python first divides 100 by 101 and gets `1114752383012499 / 2**50`. Then it takes the closest representable `Decimal` to that number, which is `Decimal('0.99009900990099009021605525049380958080291748046875')`. That's 50 bits of precision because `Decimal` tries to track and preseve precision in its calculations, and 53 bits of precision is 50 decimal digits of precision. [Update 2017-03-02: this is wrong: 53 bits of precision much less than 50 decimal digits. I'm not sure why `Decimal` is doing this.] All of the digits starting with `02160...` are just noise, but `Decimal` has no way of knowing that. Then when we multiply by `101` we get `Decimal('99.999...58030')`, because our `Decimal` version of the `float` was slightly less than true `100/101`.

On the other hand, when python gets `101 * (100/101)` it stays in IEEE 754 land. It multiplies `101` by ```1114752383012499 / 2**50```, and the closest `float` to that is exactly `100`.

I ran into this bug because I had been trying to break a large change into two behavior-preserving changes, first switching some code to use `Decimal` and then switching the rest. To fix the bug, I combined the two changes into one, so that we switched from using `float` and `int` to using `Decimal` throughout this module.

Comment via: google plus, facebook, hacker news

### Recent posts on blogs I like:

#### The Gift of It's Your Problem Now

Recently a security hole in a certain open source Java library resulted in a worldwide emergency kerfuffle as, say, 40% of the possibly hundreds of millions of worldwide deployments of this library needed to be updated in a hurry. (The other 60% also …

via apenwarr January 1, 2022

#### The container throttling problem

This is an excerpt from an internal document David Mackey and I co-authored in April 2019. The document is excerpted since much of the original doc was about comparing possible approaches to increasing efficency at Twitter, which is mostly information tha…

via Posts on December 18, 2021

#### Experiences in raising children in shared housing

Sometimes I see posts about people’s hope to raise children in a group housing situation, and it often seems overly optimistic to me. In particular they seem to expect that there will be more shared childcare than I think should be expected. Today I talke…

via The whole sky October 18, 2021