Floating point follies

While talking about exception handling, Roy ran up against a change in behavior from VB6 to VB.NET. In VB6, if you evaluate the expression “x = x / 0”, you’ll immediately get a divide by zero exception. In VB.NET, if you evaluate the same expression, no exception is thrown. Instead, x now holds one of the special floating point values: NaN (“Not a Number”) if x was zero to begin with, or positive or negative infinity if it wasn’t. In reality, you could get this behavior in VB6 as well – on the Compile tab of the project properties, there’s an “Advanced Optimizations” button that includes an option “Remove Floating Point Error Checks.” If you check that and make an exe (it doesn’t affect F5 in the IDE), then the assignment will work and x will contain the special value instead of raising an error.

The reason for all this is that floating point numbers (as defined by the IEEE) support a number of special values that aren’t usually surfaced to VB programmers. In addition to NaN, a floating point variable can also contain the values positive and negative infinity. Not being a numerical expert, I won’t even pretend to explain what these values are for or why the IEEE felt they were important to include as part of the definition of floating point numbers. The bottom line is that they are there. By default in VB6, though, we inserted x86 instructions that checked for the various non-numerical states and would throw an exception if you encountered one. In VB.NET, however, things were complicated by the way that the CLR IL was designed, namely the fact that it doesn’t support a really efficient way to check for non-numerical results. In VB 2002, we tried simulating the VB6 behavior by emitting CKFINITE opcodes after each floating point operation, but this absolutely tanked all floating point performance. So we ended up dropping the VB6 behavior, and now floating point division by zero will result in NaN or an infinity instead of an exception.
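If you want to see these special values for yourself, or get something like the old VB6 behavior back at a few chosen spots, here is a minimal sketch. The helper name CheckFinite is made up for illustration and is not part of VB.NET; it only assumes the standard Double.IsNaN and Double.IsInfinity methods.

```vbnet
Imports System

Module SpecialValuesDemo
    ' Hypothetical helper (not part of VB.NET): throws if the value is NaN or
    ' an infinity, roughly imitating the check VB6 performed by default.
    Function CheckFinite(ByVal value As Double) As Double
        If Double.IsNaN(value) OrElse Double.IsInfinity(value) Then
            Throw New ArithmeticException("Floating point result is not a finite number.")
        End If
        Return value
    End Function

    Sub Main()
        Dim zero As Double = 0
        Dim one As Double = 1

        Dim a As Double = zero / zero   ' NaN
        Dim b As Double = one / zero    ' positive infinity
        Dim c As Double = -one / zero   ' negative infinity

        Console.WriteLine(Double.IsNaN(a))               ' True
        Console.WriteLine(Double.IsPositiveInfinity(b))  ' True
        Console.WriteLine(Double.IsNegativeInfinity(c))  ' True

        ' NaN never compares equal to anything, including itself.
        Console.WriteLine(a = a)                         ' False

        ' Opt in to an exception only where you actually want one.
        Try
            Dim d As Double = CheckFinite(one / zero)
        Catch ex As ArithmeticException
            Console.WriteLine("Caught: " & ex.Message)
        End Try
    End Sub
End Module
```

Checking only the results you care about keeps the cost out of the rest of your floating point code, which is the trade-off that checking after every operation couldn't make.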

A related issue is that most processors’ FPU registers can handle floating point numbers with a higher precision than can be stored in the standard .NET Double datatype. As a result, depending on how the CLR JIT compiler enregisters variables and temporary values, it’s possible that floating point operations (including comparisons) can be done at differing levels of precision. For example, the following code could print False instead of True:

Dim d1, d2 As Double

d1 = Math.Atan(-1)
d2 = Math.Atan(-1)

If d1 = d2 Then
    Console.WriteLine("True")
Else
    Console.WriteLine("False")
End If

That’s because one of the variables could be left in an FPU register while the other variable is actually stored back on the stack. When you compare the two values, they aren’t equal because one of the values was truncated (so it could be stored on the stack) and the other value wasn’t. By default, VB6 suppressed this behavior by emitting x86 code to truncate values that were at higher precisions. (You can turn this off on the advanced optimizations page, too, under the option “Allow Unrounded Floating Point Operations.”) But the CLR equivalent – inserting CONV.R8 opcodes everywhere a higher precision value might be used – had similar performance problems to the NaN checks.
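One common way to sidestep this is to compare within a tolerance rather than testing for exact equality. The following is a minimal sketch of that idea. The helper names NearlyEqual and NearlyEqualRelative and the tolerance 1.0E-9 are assumptions picked for illustration; the right tolerance depends on your application.

```vbnet
Imports System

Module ToleranceCompareDemo
    ' Hypothetical helper: True when f and g differ by less than eps
    ' in absolute terms.
    Function NearlyEqual(ByVal f As Double, ByVal g As Double, ByVal eps As Double) As Boolean
        Return Math.Abs(f - g) < eps
    End Function

    ' Hypothetical helper: True when f and g differ by less than eps
    ' relative to the larger of their magnitudes.
    Function NearlyEqualRelative(ByVal f As Double, ByVal g As Double, ByVal eps As Double) As Boolean
        ' Exact matches (including both values being zero) are equal by definition.
        If f = g Then Return True
        Dim magnitude As Double = Math.Max(Math.Abs(f), Math.Abs(g))
        Return Math.Abs(f - g) < eps * magnitude
    End Function

    Sub Main()
        Dim d1, d2 As Double
        d1 = Math.Atan(-1)
        d2 = Math.Atan(-1)

        ' Any difference caused by extra register precision is far below 1.0E-9.
        Console.WriteLine(NearlyEqual(d1, d2, 1.0E-9))                                  ' True

        ' For large values, scale the tolerance by the magnitude of the operands.
        Console.WriteLine(NearlyEqualRelative(1000000.0, 1000000.0000001, 1.0E-9))      ' True
    End Sub
End Module
```

The absolute form is fine when you know the rough size of your values. The relative form is the safer choice when they can range over many orders of magnitude.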

As a result, VB.NET floating point operations are much “closer to the metal,” which is either a good thing or a bad thing depending on your perspective. You definitely have to be a little more aware of what you’re doing when dealing with floating point numbers…

6 thoughts on “Floating point follies”

  1. Cory Smith

    I knew about the NaN change… which is pretty cool. However, the other one that you mention… how would you solve the comparison of two values if one is still on the stack and the other has been rounded? In other words, what would be the proper way to handle this type of situation (work around, etc.)?

  2. Mark Hurd

    Cory: Use a comparison like you "always should" with floating point.
    That is, the condition f = g should always be coded as Abs(f - g) < Eps for some small Eps.
    Depending upon your application, you may need to test relative differences using Abs((f - g)/f) < Eps.

    In most situations these can be hand optimised.

  3. Adam

    The way I see it, you could truncate them beforehand by code. You wouldn’t be as exact when printing them in their values, but the boolean operators would be more efficient.

    Great article by the way! That should help some converters from VB to VB.NET.
