IBM Floating Point Architecture

IBM System/360 computers, and subsequent machines based on that architecture (mainframes), support a hexadecimal floating-point (HFP) format. The format is used by SAS Transport files, as required by the Food and Drug Administration (FDA) for New Drug Application (NDA) study submissions (see TS-140). It is also used in GRIB data files to exchange the output of weather prediction models, and in GDSII stream format files.

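Hexadecimal floating-point takes a similar approach to IEEE 754 binary floating-point (a sign bit, a biased exponent, and a fraction), but the exponent is a power of 16 rather than a power of 2. In single precision, the fraction field is longer (24 bits versus 23) and the exponent field is shorter (7 bits versus 8).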

Single-precision 32 bit

A single-precision hexadecimal floating-point number is stored in a 32-bit word:

 1    7               24             width in bits
+-+-------+------------------------+
|S|  Exp  |        Fraction        |
+-+-------+------------------------+
31 30   24 23                     0  bit index (0 on right)
   bias +64

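Note that in this format the leading bit of the fraction is not suppressed (there is no hidden bit as in IEEE 754), and the radix point lies to the left of the fraction. Because the exponent is a power of 16, the radix point moves in increments of 4 bits (one hexadecimal digit). The value represented is (−1)^S × 0.Fraction × 16^(Exp − 64).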
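The field layout above can be illustrated with a minimal Python sketch that decodes a 32-bit word. The function name `ibm32_to_float` is hypothetical, the word is assumed to be supplied as an unsigned integer, and the result is returned as an ordinary Python (IEEE 754 double) value.

```python
def ibm32_to_float(word: int) -> float:
    """Decode a 32-bit IBM hexadecimal floating-point word (illustrative helper)."""
    sign = (word >> 31) & 0x1        # bit 31
    exponent = (word >> 24) & 0x7F   # bits 30..24, stored with a bias of 64
    fraction = word & 0x00FFFFFF     # bits 23..0, radix point to the left

    # value = 0.fraction * 16^(exponent - 64); the fraction is scaled by 2^-24
    magnitude = (fraction / float(1 << 24)) * 16.0 ** (exponent - 64)
    return -magnitude if sign else magnitude


print(ibm32_to_float(0xC276A000))  # -118.625
print(ibm32_to_float(0x41100000))  # 1.0
```

Every 32-bit value in this format fits exactly in a Python float, so the sketch does no rounding.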

An Example

Let us encode the number −118.625 using the IBM floating point system.

We need to get the sign, the exponent and the fraction.

Because it is a negative number, the sign is "1". Let's find the others.

First, we write the number (without the sign) in binary notation (see binary numeral system for how to do this). The result is 1110110.101.

Now, let's move the radix point left, four bits at a time (because the exponent is a power of 16, not 2). Two shifts of four bits are needed: 1110110.101 = .01110110101 · 16^2

The fraction is the part at the right of the radix point, filled with 0 on the right until we get all 24 bits. That is 011101101010000000000000.

The exponent is 2, but we need to bias it and convert it to binary (so that the most negative exponent is stored as 0, and all stored exponents are non-negative binary numbers). For the System/360 format, the bias is 64, so 2 + 64 = 66. In binary, this is written as 1000010.

Putting them all together:

 1    7               24             width in bits
+-+-------+------------------------+
|S|  Exp  |        Fraction        |
|1|1000010|011101101010000000000000|
+-+-------+------------------------+
31 30   24 23                     0  bit index (0 on right)
   bias +64

In hexadecimal, the resulting word is C276A000.
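The encoding steps of this example can be sketched in Python as follows. The function name `float_to_ibm32` is hypothetical, and the sketch assumes that excess low-order fraction bits are simply truncated and that out-of-range values raise an error. Real hardware and conversion libraries may handle rounding and overflow differently.

```python
import math


def float_to_ibm32(value: float) -> int:
    """Encode a Python float as a 32-bit IBM hexadecimal float (illustrative helper)."""
    if not math.isfinite(value):
        raise ValueError("the format has no encoding for infinities or NaNs")
    if value == 0.0:
        return 0

    sign = 1 if value < 0 else 0
    magnitude = abs(value)
    exponent = 0

    # Move the radix point one hexadecimal digit (4 bits) at a time
    # until 1/16 <= magnitude < 1.
    while magnitude >= 1.0:
        magnitude /= 16.0
        exponent += 1
    while magnitude < 1.0 / 16.0:
        magnitude *= 16.0
        exponent -= 1

    fraction = int(magnitude * (1 << 24))  # keep 24 bits, truncating the rest
    biased = exponent + 64                 # excess-64 exponent
    if not 0 <= biased <= 127:
        raise OverflowError("exponent does not fit in 7 bits")

    return (sign << 31) | (biased << 24) | fraction


word = float_to_ibm32(-118.625)
print(hex(word))             # 0xc276a000
print(format(word, "032b"))  # sign, exponent and fraction bits concatenated
```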

Double-precision 64 bit

Double-precision is the same except that the mantissa (fraction) field is wider:

 1    7                               56                              width in bits
+-+-------+--------------------------------------------------------+
|S|  Exp  |                        Fraction                        |
+-+-------+--------------------------------------------------------+
63 62   56 55                                                     0  bit index (0 on right)

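The exponent field is again 7 bits wide with a bias of 64 (half the range of the field), and the exponent is a power of 16. Even though the base is 16, the exponent field in this form is smaller than the equivalent in IEEE 754, which uses 8 bits in single precision and 11 bits in double precision.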
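As a minimal sketch of the double-precision layout, the following Python function decodes a 64-bit word. The name `ibm64_to_float` is hypothetical and the word is assumed to be supplied as an unsigned integer. Since the 56-bit fraction is wider than the 53-bit significand of a Python float, the result may be rounded.

```python
import math


def ibm64_to_float(word: int) -> float:
    """Decode a 64-bit IBM hexadecimal floating-point word (illustrative helper)."""
    sign = (word >> 63) & 0x1            # bit 63
    exponent = (word >> 56) & 0x7F       # bits 62..56, stored with a bias of 64
    fraction = word & 0x00FFFFFFFFFFFFFF  # bits 55..0, radix point to the left

    # value = fraction * 2^-56 * 16^(exponent - 64), with 16^n written as 2^(4n)
    magnitude = math.ldexp(fraction, 4 * (exponent - 64) - 56)
    return -magnitude if sign else magnitude


print(ibm64_to_float(0xC276A00000000000))  # -118.625
```

The single-precision pattern for −118.625 extended with 32 zero bits decodes to the same value, because only the fraction field is widened.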

See, for example: Schwarz, "CMOS floating-point unit for the S/390 Parallel Enterprise Server G4".

Since 1998, IBM mainframes have also included binary floating-point (BFP) units which conform to IEEE 754. Decimal floating-point (DFP) was added to the IBM System z9 GA2 in millicode, and in 2008 to the IBM System z10 in hardware. IBM mainframes now support three floating-point radices, with three HFP formats, three BFP formats, and three DFP formats. There are two floating-point units per core, one supporting HFP and BFP and one supporting DFP; a single register file, the FPRs, holds all three formats.

Systems which use Base-16 Excess-64 Floating Point format

# IBM System/360
# GEC 4000 series minicomputers
# Interdata 16-bit and 32-bit computers

Wikimedia Foundation. 2010.