0.1 + 0.2
Ask your favorite programming language what 0.1 + 0.2 is. What does it say?
Here are the answers from the three most-used languages, plus a bonus one:
Language | 0.1 + 0.2 |
---|---|
Python | 0.30000000000000004 |
Java | 0.30000000000000004 |
JavaScript | 0.30000000000000004 |
Roc | 0.3 |
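If you want to reproduce that table yourself, the quickest check is a one-liner in a REPL. Here's what it looks like in Python (Java and JavaScript print the same digits for this expression):

```python
# Python 3: the default printout shows the rounding error.
print(0.1 + 0.2)  # 0.30000000000000004
```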
The first time I saw 0.30000000000000004 as the answer to 0.1 + 0.2, I was unpleasantly surprised. Once I got used to seeing it from so many languages, the surprise wore off; now I'm pleasantly surprised if I see 0.3 instead!
So why doesn't every language answer 0.3? Why is the most common answer 0.30000000000000004, and what are the pros and cons of having the answer be 0.3 instead?
The tradeoffs aren't as straightforward as they might seem!
Base-10 and Base-2
Like most programming languages, Python, Java, and JavaScript all compile their decimal number literals into base-2 floating-point form. This runtime representation has serious performance benefits (modern CPUs have dedicated instructions for it), but its imprecise representation of base-10 numbers is a notorious source of bugs—especially when money is involved.
Many fractions can be precisely represented in base-10. For example, the fraction 1/10 can be precisely represented in base-10 as 0.1.
However, if we want to represent the fraction 1/3 in base-10 format, we can write down 0.33, or 0.3333, or maybe even 0.3333333—but no matter how many digits we use, whatever decimal number we write down won't equal 1/3 exactly. Some precision has been lost in that conversion from fractional to decimal representation!
It's common knowledge that although 1/3 + 2/3 = 1, if we convert those fractions to decimal representations before adding them—meaning we end up adding 0.3333 + 0.6666—we end up getting something just short of 1. Using an ordinary calculator app to work with fractions often results in that sort of precision loss.
The same flavor of precision loss that happens when we convert 1/3 to a decimal (resulting in 0.3333…, so not exactly 1/3) also happens when we convert 1/10 to base-2: we end up with a number that's not exactly 1/10, even though it might be as close as we could get. This might not be as familiar a source of precision loss as converting 1/3 to decimal (base-10), but it's the same basic idea!
Precision loss outside of calculations
For a long time after I found out about the 0.1 + 0.2 imprecision, I thought that the numbers 0.1 and 0.2 were being represented precisely, and it was the addition that was causing the problem. As it turns out, that's not true; in all of those languages, writing down the numbers 0.1 and 0.2 does not result in exactly 1/10 and 2/10 getting stored in memory. Precision loss has already happened!
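You can see this directly in Python, which can recover the exact rational value a float is storing (a quick sketch using the standard-library fractions module):

```python
from fractions import Fraction

# Fraction(float) converts the float's exact stored value into a rational number.
print(Fraction(0.1))                     # 3602879701896397/36028797018963968
print(Fraction(0.1) == Fraction(1, 10))  # False: close to 1/10, but not equal to it
```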
Here's what 0.1, 0.2, and 0.3 look like in memory if you ask one of those top three languages to print out base-10 representations of the digits it's storing.
Literal | Value stored in memory |
---|---|
0.1 | 0.100000000000000005551115123125782702118158340454101562 |
0.2 | 0.200000000000000011102230246251565404236316680908203125 |
0.3 | 0.299999999999999988897769753748434595763683319091796875 |
When you add up those approximations of 0.1 and 0.2, the answer is:
0.3000000000000000444089209850062616169452667236328125
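If you'd like to see those digits for yourself, Python's decimal module can display the exact value a float is storing (a sketch; other languages have similar facilities):

```python
from decimal import Decimal

# Decimal(float) shows every digit of the exact base-2 value in memory.
print(Decimal(0.1))        # 0.1000000000000000055511151231257827021181583404541015625
print(Decimal(0.2))        # 0.200000000000000011102230246251565404236316680908203125
print(Decimal(0.1 + 0.2))  # 0.3000000000000000444089209850062616169452667236328125
```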
Not only is this not the three-tenths we'd like to get from adding one-tenth and two-tenths (just as 0.9999999999999 is not the answer we'd like from adding one-third and two-thirds, yet it's what a decimal calculator will tell us), it's also not equal to the approximation we saw in the table for the decimal 0.3, which got translated into the base-2 value beginning 0.29999999...
This is why 0.1 + 0.2 == 0.3 returns false in so many languages. When you ask the program if 0.1 + 0.2 is equal to 0.3, what it's actually comparing is:
0.100000000000000005551115123125782702118158340454101562 +
0.200000000000000011102230246251565404236316680908203125 ==
0.299999999999999988897769753748434595763683319091796875
This evaluates to false because adding the first two numbers gives 0.3000000000000000444089209850062616169452667236328125, and not 0.299999999999999988897769753748434595763683319091796875, which is what the compiler translates the source code of 0.3 into at runtime.
Even knowing what's going on behind the scenes, some results of base-2 floating-point calculations can still be surprising. For example, if you ask one of these same languages whether 0.1 + 0.1 == 0.2, it calculates:
0.100000000000000005551115123125782702118158340454101562 +
0.100000000000000005551115123125782702118158340454101562 ==
0.200000000000000011102230246251565404236316680908203125
You might expect this to also return false, because the last digit of the first two numbers is 2, and there's no way adding a pair of 2s together will equal 5 (the last digit of the third number). However, it returns true!
This happens because even the numbers in the table above (which were taken directly from a language's printouts) have lost some precision compared to what's actually in memory! When languages print out base-2 numbers as base-10 decimals, they typically round off the last digit.
So not only is there precision loss when converting from the source code representation of "0.1" to the in-memory base-2 representation (which is only an approximation of one-tenth), there's also additional precision loss when converting that in-memory base-2 representation back into a base-10 format for printing to the screen.
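Both comparisons are easy to verify; here they are in Python:

```python
print(0.1 + 0.2 == 0.3)  # False -- the sum is the 0.30000000000000004... value, not the stored 0.3 (0.29999999...)
print(0.1 + 0.1 == 0.2)  # True  -- the rounding happens to land exactly on 0.2's stored value
```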
All this precision loss contributes to the general recommendation to avoid using base-2 floats for base-10 calculations involving money, even though most languages make base-2 the default.
Hiding precision loss
Earlier, we saw how adding the base-2 representations of 0.1 and 0.2 led to this answer in memory:
0.3000000000000000444089209850062616169452667236328125
However, none of the languages we looked at earlier printed out that entire number. Instead they all printed a truncated version of it: 0.30000000000000004—so, an approximation of the approximation, meaning even more precision loss.
Other languages truncate even more digits during printing. For example, in C# if you print the answer to 0.1 + 0.2, it will show 0.3. Same with C++. This is not because they're getting a more precise answer—it's the same floating-point operation that Python, Java, and JavaScript are doing, and it puts the same imprecise answer in memory—it's that they hide so many digits when printing out 0.3000000000000000444089209850062616169452667236328125 that only the 0.3 part is visible.
Hiding this many digits obscures the fact that any precision was lost at all! This results in self-contradictory output like (0.1 + 0.2) printing out 0.3, yet (0.1 + 0.2 == 0.3) evaluating to false. (This happens in both C++ and C#.) You can always opt into printing more digits after the decimal point, which can reveal the precision loss, but this default formatting can give you an inaccurate mental model of the relationships between these numbers in the program.
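Python's default printing keeps 17 significant digits, but the same hiding effect is easy to reproduce with a format string (a sketch):

```python
total = 0.1 + 0.2
print(f"{total:.1f}")   # 0.3   -- rounded for display, so the loss is invisible
print(total == 0.3)     # False -- the comparison still sees the full stored value
print(f"{total:.17g}")  # 0.30000000000000004 -- asking for more digits reveals it
```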
Compile-time vs. runtime
Go takes a different approach. When you write 0.1 + 0.2 in Go, that expression gets evaluated to 0.3 at compile time with no precision loss, so 0.1 + 0.2 == 0.3 evaluates to true. However, if you use variables instead of constants—that is, set a := 0.1 and b := 0.2 and print a + b—you'll once again see 0.30000000000000004. This is because Go treats constants differently from variables: constant expressions are evaluated with precise (but slower) arbitrary-precision arithmetic at compile time, whereas variables use the same base-2 floating-point math at runtime as Python, Java, JavaScript, and so on.
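Here's a rough Python sketch of the idea behind Go's constant evaluation—do the arithmetic exactly first, and only convert to a float once at the end. (This is an analogy using exact rationals, not how Go is actually implemented.)

```python
from fractions import Fraction

exact = Fraction(1, 10) + Fraction(2, 10)  # exactly 3/10, no rounding yet
print(float(exact) == 0.3)                 # True  -- one rounding step, at the very end
print(0.1 + 0.2 == 0.3)                    # False -- each literal and the addition all round
```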
In contrast to all these designs, when Roc prints 0.3, it's because the 0.1 + 0.2 operation is consistently doing base-10 arithmetic rather than the more common base-2. The reason you aren't seeing precision loss is that there isn't any.
Floating-Point versus Fixed-Point
It's not that Roc only supports base-10 arithmetic. It also supports the typical base-2 floating-point numbers, because in many situations the performance benefits are absolutely worth the cost of precision loss. What sets Roc apart is its choice of default; when you write decimal literals like 0.1 or 0.2 in Roc, by default they're represented by a 128-bit fixed-point base-10 number that never loses precision, making it reasonable to use for calculations involving money.
In Roc, floats are opt-in rather than opt-out.
C#'s floating-point base-10 System.Decimal
The C# standard library has its own 128-bit decimal type, although it isn't used by default when you write normal decimal literals. (The same is true of F#, a functional language that shares part of its standard library with C#.) This type is called System.Decimal, and it has some important differences from Roc's base-10 Dec type.
One difference is that whereas Roc's Dec type uses a fixed-point representation, System.Decimal is still floating-point—just base-10 floating-point instead of base-2 floating-point. One difference between fixed-point and floating-point is how many digits each can represent before and after the decimal point.
Type | Representation | Digits |
---|---|---|
Dec (Roc) | fixed-point, base-10 | 20 before the dot, 18 after the dot |
System.Decimal (C#) | floating-point, base-10 | 28-29 total |
64-bit float (aka double) | floating-point, base-2 | ~15-17 total |
It's possible to have a fixed-point, base-2 representation, but we won't discuss that combination because it's so rarely used in practice.
Roc's fixed-point base-10 Dec
Fixed-point decimals have a hardcoded number of digits they support before the decimal point, and a hardcoded number of digits they support after it.
For example, Dec can always represent 20 full digits before the decimal point and 18 full digits after it, with no precision loss. If you put 11111111111111111111.111111111111111111 + 22222222222222222222.222222222222222222 into roc repl, it happily prints out 33333333333333333333.333333333333333333. (Each of those numbers has 20 digits before the decimal point and 18 after it.) You can even put 20 nines before the decimal point and 18 nines after it! However, if you try to put 21 nines before the decimal point, you'll get an error.
In contrast, System.Decimal is floating-point. This means it cares about the total number of digits in the number, but it doesn't care how many of those digits happen to be before or after the decimal point. So you can absolutely put 21 nines into a C# System.Decimal, with another 7 nines after the decimal point if you like. For that matter, you can also put 7 nines before the decimal point and then 21 nines after it. Dec's fixed-point representation doesn't support either of those numbers!
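Python's decimal module behaves the same way: it's a floating-point base-10 type with a budget of total significant digits (28 by default, close to System.Decimal's 28-29), so it can stand in for System.Decimal here as a sketch:

```python
from decimal import Decimal, getcontext

getcontext().prec = 28   # total significant digits, like System.Decimal's budget

a = Decimal("999999999999999999999.9999999")   # 21 digits before the dot, 7 after
b = Decimal("9999999.999999999999999999999")   # 7 before the dot, 21 after
print(a + 0)  # 999999999999999999999.9999999  (adding 0 forces the 28-digit context to apply)
print(b + 0)  # 9999999.999999999999999999999  -- both shapes fit within the budget
```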
Precision loss in floating-point base-10
This upside of System.Decimal's floating-point representation comes at the cost of potential precision loss in some arithmetic operations. Here's what 11111111111111111111.11111111111111111 + 44444444444444444444.44444444444444444 evaluates to in the two different representations:
Representation | Result |
---|---|
Dec (fixed-point) | 55555555555555555555.55555555555555555 |
System.Decimal (floating-point) | 55555555555555555555.555555555 |
Silent precision loss strikes again! This happens because System.Decimal has a hardcoded limit on how many total digits it can represent, and this answer requires more total digits than it supports. The extra digits are truncated.
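Python's decimal module (also floating-point base-10, with 28 significant digits by default) shows the same kind of loss—though it rounds the excess digits rather than truncating them, so its last kept digit differs slightly from System.Decimal's answer:

```python
from decimal import Decimal, getcontext

getcontext().prec = 28
x = Decimal("11111111111111111111.11111111111111111")
y = Decimal("44444444444444444444.44444444444444444")
print(x + y)  # 55555555555555555555.55555556 -- the exact 37-digit sum didn't fit
```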
Of course, there are also numbers too big for Dec to represent. It supports adding 11111111111111111111.11111111111111111 + 22222222222222222222.22222222222222222 with no problem, but if you add 99999999999999999999.99999999999999999 + 99999999999999999999.99999999999999999 instead, you'll get the same outcome as when two integers add up to something too big for their integer size: an overflow runtime error.
This demonstrates a tradeoff between fixed-point and floating-point representations which is totally separate from base-2 versus base-10: fixed-point means addition overflow is necessarily an error condition, whereas floating-point can reasonably support either of two designs—overflowing to an error, or overflowing to silent precision loss. (System.Decimal could have treated addition overflow as an error, but it goes with the silent precision loss design instead.)
Precision loss in fixed-point base-10
Silent precision loss can happen in fixed-point too. For example, both fixed-point and floating-point typically choose precision loss over crashing when division gives an answer that can't be represented precisely. If you divide 1.0 by 3.0, basically any base-10 or base-2 number system will quietly discard information regardless of fixed-point or floating-point. (There are number types which can support dividing 1 by 3 without precision loss, such as Clojure's Ratio or Scheme's rational, which of course have their own tradeoffs.)
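Python makes for a compact comparison here: its float and its Decimal both lose precision on 1 ÷ 3, while its Fraction type (the analogue of Clojure's Ratio or Scheme's rationals) keeps the result exact:

```python
from decimal import Decimal
from fractions import Fraction

print(1.0 / 3.0)                # 0.3333333333333333             (base-2 float)
print(Decimal(1) / Decimal(3))  # 0.3333333333333333333333333333 (base-10, 28 digits)
print(Fraction(1, 3))           # 1/3 -- exact, with its own tradeoffs
```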
Fortunately, there's a generally high awareness of the edge case of precision loss during division—even compared to other common edge cases like division by zero—so it tends not to result in as many bugs in practice as edge cases where there isn't as much awareness.
That said, it can happen in multiplication too; if you put 0.000000001 * 0.0000000001 into roc repl, it will print 0, because the answer's first 18 digits after the decimal point are all zeroes and Dec only retains 18 digits after the decimal point. The more times a decimal gets multiplied by another decimal, the more likely it is that precision loss will occur—potentially in ways where the answer varies based on order of operations. For example:
0.123456789 * 100000000000 * 0.00000000001
0.123456789 * 0.00000000001 * 100000000000
If you put these both into roc repl, the first correctly returns 0.123456789 (since 100000000000 * 0.00000000001 equals 1). However, the second returns 0.1234567 instead; it lost some precision because it first multiplied by such a small number that it ran out of digits after the decimal point and lost information. Multiplying by a larger number after the fact can't bring that lost information back!
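Here's a rough emulation of that behavior using Python's decimal module—my own sketch, not Roc's implementation—which keeps 18 digits after the decimal point and truncates anything beyond that after each multiplication:

```python
from decimal import Decimal, getcontext, ROUND_DOWN

getcontext().prec = 50   # plenty of total digits; only the 18-after-the-dot limit matters here
STEP = Decimal("1e-18")  # 18 digits after the decimal point, like Dec

def mul_fixed18(a, b):
    # Multiply, then truncate to 18 digits after the decimal point.
    return (a * b).quantize(STEP, rounding=ROUND_DOWN)

x = Decimal("0.123456789")
print(mul_fixed18(mul_fixed18(x, Decimal("100000000000")), Decimal("0.00000000001")))
# 0.123456789000000000 -- multiplying by the large number first loses nothing
print(mul_fixed18(mul_fixed18(x, Decimal("0.00000000001")), Decimal("100000000000")))
# 0.123456700000000000 -- the tiny intermediate ran out of digits, matching Dec's 0.1234567
```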
This might sound like a situation where "crash rather than losing precision" would be preferable, but then you'd end up in situations where dividing a number by 3 works fine (at the cost of imprecision) but multiplying by 1/3 crashes. There are many cases where that isn't an improvement!
Situations like this can come up in practice when multiplying many very small (or very large) numbers, such as in scientific computing. In these cases, floating-point can be preferable to fixed-point because the number of digits after the decimal point can expand to avoid precision loss. However, base-10 tends not to be a relevant improvement over base-2 in scientific computing, so base-2 floating-point is usually preferable over Dec there anyway for its hardware-accelerated performance.
All of these are examples of why there isn't an obvious "best" one-size-fits-all choice for decimal numbers. No matter what you choose, there are always relevant tradeoffs!
Overflowing to the Heap
A design that often comes up in discussions of overflow and precision loss is "arbitrary-size" numbers, which switch to using heap allocations on overflow in order to store larger numbers instead of crashing.
Of course, "arbitrary-size" numbers can't
actually represent arbitrary numbers because all but a
hilariously small proportion of numbers in mathematics are too
big to fit in any computer's memory. Even 1 divided by 3
can't be represented without precision loss in base-10 or
base-2 format, no matter how much memory you use in the attempt.
Still, overflowing to the heap can raise the maximum number of digits from "so high that overflow will only occur if you're working with numbers in the undecillion range or there's an infinite loop bug or something" to "a whole lot higher than that, just in case." Naturally, how often that distinction comes up in practice varies by use case.
A major reason not to allow numbers to automatically overflow to the heap is that supporting this slows down all arithmetic operations, even if the overflow never comes up in practice. So to be worth that cost, overflow has to come up often enough in practice—and overflowing to the heap has to be beneficial enough when it does—to compensate for slowing down all arithmetic.
This is one reason many languages (including Roc) choose not to support it: essentially all programs would pay a performance cost, and only a tiny number of outlier programs would see any benefit.
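Python's built-in int is a familiar example of paying that cost: every integer is an arbitrary-precision heap object, so it never overflows, but every integer operation carries the bookkeeping overhead that makes this possible.

```python
n = 99999999999999999999      # 20 digits -- already too big for a 64-bit integer
print(n * n)                  # 9999999999999999999800000000000000000001
print((n * n).bit_length())   # 133 -- well past what a single machine register holds
```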
Performance
Speaking of performance, how do fixed-point and floating-point base-10 and base-2 numbers perform?
Representations
Here, performance depends on how things are represented. Let's look at these three types from earlier:
Type | Representation | Digits |
---|---|---|
Dec (Roc) | fixed-point, base-10 | 20 before the dot, 18 after the dot |
System.Decimal (C#) | floating-point, base-10 | 28-29 total |
64-bit float (aka "double") | floating-point, base-2 | ~15-17 total |
Many guides have been written about the base-2 floating-point representation; I personally like https://floating-point-gui.de, but there are plenty of others. Likewise, there are various articles about how System.Decimal is represented in memory; I liked this one.
Without going into the level of depth those articles cover, the main thing these floating-point numbers have in common is that they store:
- An exponent number
- A coefficient number
The way these two stored values get translated into a single number is basically one of these two calculations, depending on whether it's base-2 or base-10:
coefficient * (10 ^ exponent)
coefficient * (2 ^ exponent)
(When reading about floating-point numbers, the term "mantissa" or "significand" is often used instead of "coefficient." All three words are pretty fun.)
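You can peek at both decompositions from Python. The storage formats differ in their details, but the coefficient-times-base-to-the-exponent shape is the same (a sketch):

```python
from decimal import Decimal

# Base-2: float.hex() shows the stored coefficient and exponent directly.
print((0.1).hex())                   # 0x1.999999999999ap-4  ->  0x1.999999999999a * 2**-4

# Base-10: a Decimal stores a sign, a digit tuple (the coefficient), and an exponent.
print(Decimal("12.345").as_tuple())  # DecimalTuple(sign=0, digits=(1, 2, 3, 4, 5), exponent=-3)
```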
Floating-point numbers which follow the IEEE 754 specification include the special values NaN, Infinity, and -Infinity. NaN is defined in the specification to be unequal to NaN, which can cause various bugs. As such, Roc gives a compile-time error if you try to compare two floating-point numbers with == (although == is allowed for Dec and integers).
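The NaN inequality is easy to see in any language with base-2 floats; in Python:

```python
import math

nan = float("nan")
print(nan == nan)       # False -- IEEE 754 defines NaN as unequal to everything, including itself
print(math.isnan(nan))  # True  -- the intended way to test for NaN
```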
Neither System.Decimal nor Dec has NaN, Infinity, or -Infinity; calculations like division by zero give a runtime error instead, just like they do for integers.
Base-10 arithmetic
Doing arithmetic on the base-10 version generally involves converting things to integer representations, doing integer arithmetic, and then converting those answers back into coefficient and exponent. This makes essentially every arithmetic operation on System.Decimal involve several CPU instructions. In contrast, base-2 floating-point numbers benefit from dedicated individual CPU instructions, making arithmetic much faster.
Roc's Dec implementation (largely the work of Brendan Hansknecht—thanks, Brendan!) is essentially represented in memory as a 128-bit integer, except one that gets rendered with a decimal point in a hardcoded position. This means addition and subtraction use the same instructions as normal integer addition and subtraction. Those run so fast, they can actually outperform addition and subtraction of 64-bit base-2 floats!
Multiplication and division are a different story. Those require splitting up the 128 bits into two different 64-bit integers, doing operations on them, and then reconstructing the 128-bit representation. (You can look at the implementation to see exactly how it works.) The end result is that multiplication is usually several times slower than 64-bit float multiplication, and performance is even worse than that for division.
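To make the scaled-integer idea concrete, here's a minimal Python sketch (my own illustration, not Roc's actual code—Python ints aren't fixed at 128 bits, and this handles positive values only): store the value times 10^18 as one integer, so addition is a plain integer add while multiplication needs a rescaling step.

```python
SCALE = 10 ** 18   # 18 digits after the decimal point, like Dec

def to_fixed(s: str) -> int:
    # Parse a positive decimal string into a scaled integer.
    whole, _, frac = s.partition(".")
    return int(whole) * SCALE + int(frac.ljust(18, "0"))

def fixed_add(a: int, b: int) -> int:
    return a + b                   # ordinary integer addition

def fixed_mul(a: int, b: int) -> int:
    return (a * b) // SCALE        # multiply, then scale back down (truncating)

def show(d: int) -> str:
    return f"{d // SCALE}.{d % SCALE:018d}"

print(show(fixed_add(to_fixed("0.1"), to_fixed("0.2"))))  # 0.300000000000000000
print(show(fixed_mul(to_fixed("0.1"), to_fixed("0.2"))))  # 0.020000000000000000
```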
Some operations, such as sine, cosine, tangent, and square root, have not yet been implemented for Dec. (If implementing any of those operations for a fixed-point base-10 representation sounds like a fun project, please get in touch! People on the Roc chat are always happy to help new contributors get involved.)
Choosing a Default
So if a language supports both base-2 floating-point and base-10 fixed-point decimals, which should be used when you put in 0.1 + 0.2? Which is the better default for that language?
Type inference
Many languages determine what numeric types to use based on syntax. For example, in C#, 5 is always an integer, 5.0 is always a float, and 5.0m is always a decimal. This means that if you have a function that takes a float, you can't write 5 as the argument; you have to write 5.0, or you'll get an error. Similarly, if a function takes a decimal, you can't write 5 or 5.0; you have to write 5.0m.
Roc doesn't have this requirement. If a function takes a number—whether it's an integer, a floating-point base-2 number, or a Dec—you can always write 5 as the number you're passing in. (If it's an integer, you'll get a compiler error if you try to write 5.5, but 5.5 will be accepted for either floats or decimal numbers.)
Because of this, it's actually very rare in practice that you'll write 0.1 + 0.2 in a .roc file and have it use the default numeric type of Dec. Almost always, the type in question will end up being determined by type inference—based on how you ended up using the result of that operation.
For example, if you have a function that says it takes a Dec, and you pass in (0.1 + 0.2), the compiler will do Dec addition, and that function will end up receiving 0.3 as its argument. However, if you have a function that says it takes F64 (a 64-bit base-2 floating-point number), and you write (0.1 + 0.2) as its argument, the compiler will infer that those two numbers should be added as floats, and you'll end up passing in the not-quite-0.3 number we've been talking about all along. (You can also write number literals like 12.34f64 to opt into F64 representation without involving type inference.)
Constant expressions
Earlier, we noted that Go does something different from this. Go always evaluates constant expressions (like "0.1 + 0.2", which can be computed entirely at compile time with no need to run the program) at full precision. So even if a Go function says it takes a float, the compiler will end up treating the source code of "0.1 + 0.2" as if it had been "0.3" when calling that function.
An upside of this design is that there's less precision loss. A downside is that it sacrifices substitutability; if you have some Go code that takes user inputs of 0.1 and 0.2 and computes an answer, and you try to substitute hardcoded values of 0.1 and 0.2 (perhaps in a particular special-case path that can improve performance by skipping some operations), you'll potentially get a different answer than the user-input version. A substitution that might seem like a harmless refactor can actually change program behavior.
It's for this exact reason that Roc's design intentionally does the same thing regardless of whether the numbers are known at compile time.
When the default comes up
So given that, and given that Roc usually infers which operation to do based on type information anyway…when does the default actually come up? When are you ever writing 0.1 + 0.2 in a situation where type inference wouldn't override the default anyway?
One situation is where your whole program doesn't have type annotations, and none of your fractional numbers end up being used with outside code (such as libraries, which might request specific fractional types like Dec or F64). Roc has full type inference, so you never need to write type annotations, and although it's common practice to annotate top-level functions, you don't have to. If you're writing a quick script, you might choose not to annotate anything. In that scenario, Dec seems like the best default because it involves the least precision loss, and arithmetic performance is very unlikely to be noticeable. (If it is, you can of course opt into floats instead.)
Another situation is for beginners who are new to programming. If they've learned decimal math in school but have never encountered base-2 math, they would be in for some confusing experiences early on if they were playing around in the REPL and encountered unexpected inequalities like 0.1 + 0.2 not being equal to 0.3. Choosing Dec as the default means decimal math in Roc works by default the way most people know from using calculators, and beginners can learn about other representations and their tradeoffs later on.
For these reasons, Dec seems like the best choice for Roc's default representation of fractional numbers!
Try Roc
If you're intrigued by these design choices and want to give Roc a try for yourself, the tutorial is the easiest way to get up and running. It takes you from no Roc experience whatsoever to building your first program, while explaining language concepts along the way.
I also highly recommend dropping in to say hi on Roc Zulip Chat. There are lots of friendly people on there, and we love to help beginners get started!